


DECEMBER 1979 
VOL. 26, NO. 4 




OFFICE OF NAVAL RESEARCH 

NAVSO P-1278 






NAVAL RESEARCH LOGISTICS QUARTERLY 



EDITORIAL BOARD 

Marvin Denicoff, Office of Naval Research, Chairman

Ex Officio Members

Murray A. Geisler, Logistics Management Institute
W. H. Marlow, The George Washington University
Bruce J. McDonald, Office of Naval Research Tokyo

Thomas C. Varley, Office of Naval Research
Program Director

Seymour M. Selig, Office of Naval Research
Managing Editor



MANAGING EDITOR 

Seymour M. Selig

Office of Naval Research 

Arlington, Virginia 22217 



ASSOCIATE EDITORS 



Frank M. Bass, Purdue University 

Jack Borsting, Naval Postgraduate School 

Leon Cooper, Southern Methodist University 

Eric Denardo, Yale University 

Marco Fiorello, Logistics Management Institute 

Saul I. Gass, University of Maryland

Neal D. Glassman, Office of Naval Research

Paul Gray, University of Southern California 

Carl M. Harris, Mathematica, Inc.

Arnoldo Hax, Massachusetts Institute of Technology 

Alan J. Hoffman, IBM Corporation 

Uday S. Karmarkar, University of Chicago 

Paul R. Kleindorfer, University of Pennsylvania 

Darwin Klingman, University of Texas, Austin 



Kenneth O. Kortanek, Carnegie-Mellon University 

Charles Kriebel, Carnegie-Mellon University 

Jack Laderman, Bronx, New York 

Gerald J. Lieberman, Stanford University 

Clifford Marshall, Polytechnic Institute of New York 

John A. Muckstadt, Cornell University 

William P. Pierskalla, Northwestern University

Thomas L. Saaty, University of Pennsylvania 

Henry Solomon, The George Washington University 

Wlodzimierz Szwarc, University of Wisconsin, Milwaukee 

James G. Taylor, Naval Postgraduate School 

Harvey M. Wagner, The University of North Carolina 

John W. Wingate, Naval Surface Weapons Center, White Oak

Shelemyahu Zacks, Case Western Reserve University 



The Naval Research Logistics Quarterly is devoted to the dissemination of scientific information in logistics and will publish research and expository papers, including those in certain areas of mathematics, statistics, and economics, relevant to the over-all effort to improve the efficiency and effectiveness of logistics operations.

Information for Contributors is indicated on inside back cover.

The Naval Research Logistics Quarterly is published by the Office of Naval Research in the months of March, June, September, and December and can be purchased from the Superintendent of Documents, U.S. Government Printing Office, Washington, D.C. 20402. Subscription Price: $11.15 a year in the U.S. and Canada, $13.95 elsewhere. Cost of individual issues may be obtained from the Superintendent of Documents.

The views and opinions expressed in this Journal are those of the authors and not necessarily those of the Office of Naval Research.

Issuance of this periodical approved in accordance with Department of the Navy Publications and Printing Regulations, P-35 (Revised 1-74).



OPTIMAL SET PARTITIONING, MATCHINGS AND 
LAGRANGIAN DUALITY* 

George L. Nemhauser 

School of Operations Research 

and Industrial Engineering 

Cornell University 

Ithaca, New York 

Glenn M. Weber 

Christopher Newport College 
Newport News, Virginia 

ABSTRACT 

We formulate the set partitioning problem as a matching problem with simple side constraints. As a result we obtain a Lagrangian relaxation of the set partitioning problem in which the primal problem is a matching problem. To solve the Lagrangian dual we must solve a sequence of matching problems, each with different edge-weights. We use the cyclic coordinate method to iterate the multipliers, which implies that successive matching problems differ in only two edge-weights. This enables us to use sensitivity analysis to modify one optimal matching to obtain the next one. We give theoretical and empirical comparisons of these dual bounds with the conventional linear programming ones.

1. INTRODUCTION 

We consider the set partitioning problem

(SP)    max  Σ_{j=1}^{n} d_j y_j

        s.t. Σ_{j=1}^{n} a_{ij} y_j = 1,   i = 1, ..., m,

             y_j ∈ {0,1},   j = 1, ..., n,

where d_j is an arbitrary real number for all j and a_{ij} ∈ {0,1} for all i and j. Balas and Padberg [1] give a survey of applications and methods for solving the set partitioning problem. Except for algorithms developed for small size problems, most algorithms for solving the set partitioning problem use linear programming relaxations. However, for large size problems, because of degeneracy, the linear programs obtained by replacing the binary restriction on each y_j in (SP) by 0 ≤ y_j ≤ 1 often are difficult to solve (Marsten [12]). As a result, the typically large size (and sparse) set partitioning problem sometimes cannot be solved.



*This work has been supported by National Science Foundation Grant ENG75-00568 to Cornell University. 

553 



554 G.L. NEMHAUSER & G.M. WEBER

We consider a different relaxation that uses matchings on graphs and Lagrangian duality. This is accomplished by reformulating the set partitioning problem as a (weighted, perfect) matching problem, a version of (SP) in which Σ_{i=1}^{m} a_{ij} = 2 for j = 1, ..., n, with simple side constraints. The side constraints are incorporated into the objective function in a Lagrangian fashion, resulting in the primal Lagrangian matching relaxation. The matching problem is one of the few tractable combinatorial problems, and thus is an attractive relaxation for large set partitioning problems.

2. LAGRANGIAN MATCHING RELAXATION

In (SP) let a_j = (a_{1j}, ..., a_{mj}) and suppose that for all j, Σ_{i=1}^{m} a_{ij} = 2K_j for some integer K_j. This places no limitation on the generality of (SP), since if Σ_{i=1}^{m} a_{ij} = 2K_j − 1 then a new constraint y_j ≤ 1 can be added to the problem. We replace a_j by the set of columns {a_j^k}_{k=1}^{K_j}, where Σ_{k=1}^{K_j} a_j^k = a_j, Σ_{i=1}^{m} a_{ij}^k = 2 for k = 1, ..., K_j, and a_{ij}^k ∈ {0,1} for all i and k. One can form these columns in such a way that both nonzero components of a_j^k precede those of a_j^{k+1}, for all k. Column a_j^k is given an objective function coefficient of c_{jk} = d_j/K_j and is associated with the variable x_{jk} ∈ {0,1}. The {x_{jk}} are required to satisfy x_{jk} = x_{j,k+1}, k = 1, ..., K_j − 1, for all j. We thus obtain a problem equivalent to (SP) given by

(MS)    max  Σ_{j=1}^{n} Σ_{k=1}^{K_j} c_{jk} x_{jk}

        s.t. Σ_{j=1}^{n} Σ_{k=1}^{K_j} a_{ij}^k x_{jk} = 1,   i = 1, ..., m,

             x_{jk} − x_{j,k+1} = 0,   j = 1, ..., n and k = 1, ..., K_j − 1,

             x_{jk} ∈ {0,1},   all j and k,

which is a matching problem with side constraints x_{jk} = x_{j,k+1}. Using matrix notation, this problem can be written as

        max cx
        Mx = 1
        Sx = 0
        x binary,

where M and S are the coefficient matrices of the matching and side constraints, respectively. A solution to (MS) yields a solution for (SP) given in
A solution to {MS) yields a solution for {SP) given in 



PROPOSITION 1: If {x*_{jk}} is an optimal solution of (MS) then y*_j = x*_{j1}, j = 1, ..., n, is an optimal solution to (SP).
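The column-splitting step that turns (SP) into (MS) is easy to mechanize. The following sketch is our own illustration of the construction (assuming, as in the text, that odd column sums have already been evened out by appending a y_j ≤ 1 row); it splits one 0-1 column with 2K_j ones into K_j columns of exactly two ones each, ordered as required.

```python
def split_column(col):
    """Split a 0-1 column with 2K ones into K columns of exactly two ones each,
    ordered so both nonzero components of the k-th piece precede those of the
    (k+1)-st, as in the construction of (MS)."""
    rows = [i for i, a in enumerate(col) if a == 1]
    assert len(rows) % 2 == 0, "column sum must be even (append a y_j <= 1 row first)"
    m, pieces = len(col), []
    for k in range(len(rows) // 2):
        piece = [0] * m
        piece[rows[2 * k]] = piece[rows[2 * k + 1]] = 1   # two ones per piece
        pieces.append(piece)
    return pieces
```

The pieces sum componentwise back to the original column, which is the condition Σ_k a_j^k = a_j of the text.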



Let G(λ, x) = (c − λS)x, where the domain of x is {x | Mx = 1 and x binary} and λ is an unrestricted vector of Lagrange multipliers. The Lagrangian relaxation of (MS) relative to Sx = 0 is




(LR_λ)    F(λ) = max_x G(λ, x).

Without matrix notation, the Lagrangian relaxation can be written as

        max  Σ_{j=1}^{n} Σ_{k=1}^{K_j} (c_{jk} − (λ_{jk} − λ_{j,k−1})) x_{jk}

        s.t. Σ_{j=1}^{n} Σ_{k=1}^{K_j} a_{ij}^k x_{jk} = 1,   i = 1, ..., m,

             x_{jk} ∈ {0,1},   all j and k,

where, for j = 1, ..., n, λ_{j0} and λ_{jK_j} are defined to be zero. (LR_λ) is a (weighted, perfect) matching problem.

Relaxations play a very important role in integer programming algorithms. To be worthwhile, the relaxed problem should be easier to solve than the original one and should also yield a tight bound on the original problem solution. Lagrangian relaxations often fulfill both of these criteria. Since one of the criteria of a good relaxation is the tightness of its bound, the best choice for λ in (LR_λ) is the one that optimizes the Lagrangian dual

(LD)    min_λ F(λ),

where λ = (λ_1, ..., λ_n) and λ_j = (λ_{j1}, ..., λ_{j,K_j−1}), for j = 1, ..., n.

Let v(P) represent the optimal objective function value of any problem (P), and let (SPLP) represent the linear programming relaxation of (SP). Proposition 2 (see Geoffrion [7]) summarizes the relationships between (SP), (SPLP), (MS), (LR_λ) and (LD).

PROPOSITION 2: (a) v(SP) = v(MS) ≤ v(SPLP),

(b) for all λ, v(MS) ≤ v(LR_λ) (= F(λ)),

(c) if for a given λ a vector x is optimal in (LR_λ) and Sx = 0, then x is an optimal solution of (MS), and

(d) v(LD) ≤ v(SPLP).

Note that the Lagrangian relaxation using the λ found in (LD) is at least as tight as the linear programming relaxation; this is a consequence of the fact that matrix M is not totally unimodular. Typically, v(SP) < v(LD) < v(SPLP).
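Proposition 2(b), the bound F(λ) ≥ v(MS) for every λ, can be checked by brute force on a toy instance. Everything below (the incidence matrix M, the single side constraint coupling the two pieces of one split column, and the weights c) is our own illustrative data; the enumeration is exponential and only meant for checking the inequality, not for solving real instances.

```python
from itertools import product

def feasible_matchings(M, n):
    """All binary x with Mx = 1 (exhaustive; M is a node-edge incidence matrix)."""
    for x in product((0, 1), repeat=n):
        if all(sum(M[i][j] * x[j] for j in range(n)) == 1 for i in range(len(M))):
            yield x

def F(M, S, c, lam):
    """Lagrangian function F(lam) = max { (c - lam*S)x : Mx = 1, x binary }."""
    n = len(c)
    adj = [c[j] - sum(lam[r] * S[r][j] for r in range(len(S))) for j in range(n)]
    return max(sum(adj[j] * x[j] for j in range(n)) for x in feasible_matchings(M, n))

def v_MS(M, S, c):
    """Optimal value of (MS): max { cx : Mx = 1, Sx = 0, x binary }."""
    n = len(c)
    return max(sum(c[j] * x[j] for j in range(n))
               for x in feasible_matchings(M, n)
               if all(sum(S[r][j] * x[j] for j in range(n)) == 0 for r in range(len(S))))
```

On a four-node graph with five edges, where edges 0 and 1 are the two pieces of one split column (side constraint x_0 = x_1), F(λ) dominates v(MS) for every multiplier tried, and a suitable λ closes the gap entirely.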

3. OPTIMIZING THE LAGRANGIAN DUAL 

Many methods (surveyed in Fisher, Northup and Shapiro [6] and Bazaraa and Goode [2]) have been proposed for solving Lagrangian duals. By far the most widely used is the subgradient optimization method described in Held and Karp [8] and Held, Wolfe and Crowder [9]. Compared to other methods, very little "overhead" is needed and, most importantly, it has proven to be very effective computationally. In subgradient optimization, a sequence {λ^t} of multiplier vectors is generated iteratively, using at each iteration the solution that yields F(λ^t). Many components of each λ^t change from iteration to iteration, and in the context of solving (LD), new optimal matchings must be solved "from scratch." Although solving a large matching problem is much easier than solving a large linear programming problem, it still can be time consuming.




The (weighted, perfect) matching problem

        max cx
        Mx = 1
        x binary,

where each column of M contains exactly two nonzero entries, both equal to one, can be interpreted graphically by letting M be the node-edge incidence matrix of a graph in which each row of M represents a node and each column represents an edge, where edge k meets node i if and only if m_{ik} = 1, and c_k is the weight assigned to edge k. The problem then is to choose a set of edges, called a feasible matching, so that each node meets exactly one of the edges selected, in such a way that the sum of the weights on the edges chosen is as large as possible. Edmonds [3,4,5] developed an efficient (polynomially bounded) primal-dual algorithm for solving the matching problem, and Weber [14] showed how sensitivity analysis can be performed on optimal matchings to get the new optimal solution from the original optimal solution if the weight on an edge is changed. Except for some very simple special cases, the techniques involve modifying the graph by attaching additional nodes and edges near the edge whose edge-weight is to be altered. Edmonds' algorithm is re-entered with all the needed properties, including complementary slackness, being maintained. The final primal and dual solutions for the modified problem are then "translated back" to yield the optimal matching for the single altered edge-weight problem, in such a way that again, all the needed properties are maintained (and thus, the process can be repeated if other edge-weights are altered). Reoptimizing using these techniques when a single edge-weight is altered is on the order of N times more efficient than using Edmonds' algorithm "from scratch," where N equals the number of nodes in the graph.
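To make the matching problem itself concrete, here is a brute-force maximum-weight perfect matching on a small graph. This enumeration is only a stand-in for Edmonds' polynomial blossom algorithm, which is what the paper actually relies on; the four-node test graph in the usage note is invented.

```python
from itertools import combinations

def max_weight_perfect_matching(nodes, edges):
    """Maximum-weight perfect matching by brute force.  `edges` maps (u, v) -> weight.
    Enumerates all edge subsets of size |V|/2 and keeps the heaviest one in
    which each node meets exactly one edge.  Illustration only: Edmonds'
    blossom algorithm solves this in polynomial time."""
    k = len(nodes) // 2
    best_w, best_m = None, None
    for m in combinations(edges, k):
        met = [u for e in m for u in e]
        if len(set(met)) == len(nodes):       # every node met exactly once
            w = sum(edges[e] for e in m)
            if best_w is None or w > best_w:
                best_w, best_m = w, m
    return best_w, best_m
```

On a 4-cycle with one heavy chord, the chord cannot appear in any perfect matching, so the optimum pairs opposite cycle edges.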

Because of the special structure present in the side constraints of (MS), in which each variable appears in at most two equations, we choose to attempt to optimize (LD) by using an improved version of the cyclic coordinate method of nonlinear programming. The structure of the S matrix results in each λ_{jk} appearing in the coefficient of at most two variables in G(λ, x). This allows the sensitivity analysis techniques to be used to improve significantly the usual cyclic coordinate method. In this method, F(λ) is optimized cyclically in each of the coordinate directions. Thus, after initializing λ, we minimize F(λ) with respect to λ_{11}, ..., λ_{1,K_1−1}, ..., λ_{n1}, ..., λ_{n,K_n−1}, in that order, one at a time. This process, which involves Σ_{j=1}^{n} (K_j − 1) single variable minimizations, is repeated until the objective function stops decreasing. Typically, each one-variable minimization is accomplished by one of the iterative or grid type of procedures used in unconstrained optimization algorithms. However, because of the special structure of the problem, Theorem 1 provides a direct formula for each minimization, thus avoiding the time consuming "line searches." The proof of Theorem 1 appears in the Appendix.
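For contrast with the closed-form step of Theorem 1, the ordinary cyclic coordinate method with a crude one-dimensional grid search can be sketched as follows. The routine and its tolerances are our own illustration; the usage note minimizes a smooth convex test function rather than F(λ), which is piecewise linear.

```python
def cyclic_coordinate_descent(f, x0, step=1.0, tol=1e-8, max_cycles=100):
    """Minimize f by cycling through the coordinates, one at a time.  Each
    one-dimensional minimization is a crude shrinking grid search, the kind
    of procedure that Theorem 1's closed-form step replaces."""
    x = list(x0)
    fx = f(x)
    for _ in range(max_cycles):
        improved = False
        for i in range(len(x)):
            h = step
            while h > tol:                      # shrinking-grid line search
                for cand in (x[i] - h, x[i] + h):
                    trial = x[:i] + [cand] + x[i + 1:]
                    ft = f(trial)
                    if ft < fx - tol:
                        x, fx, improved = trial, ft, True
                h /= 2.0
        if not improved:                        # a full cycle gave no progress
            break
    return x, fx
```

As the text notes, such termination only guarantees a relative minimum with respect to the coordinate directions, not a global minimum of a nonsmooth function.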

THEOREM 1: Suppose x* is the current optimal matching vector, λ* is the current optimal Lagrange multiplier vector, λ̄ and λ̲ are identical to λ* except for the λ_{jk} component, with λ̄_{jk} = 1 + |c − λ*S|·1 and λ̲_{jk} = −1 − |c − λ*S|·1, x̄ maximizes G(λ̄, x) and x̲ maximizes G(λ̲, x). An optimal λ**_{jk} that minimizes F(λ) with respect to λ_{jk}, with all other components of λ fixed at their values in λ*, depends on x*, x̄ or x̲, and λ* as follows:

CASE 1: If x*_{jk} = x*_{j,k+1} = 1 then λ**_{jk} = λ*_{jk}.

CASE 2: If x*_{jk} = x*_{j,k+1} = 0 then λ**_{jk} = λ*_{jk}.






CASE 3: If x*_{jk} = 1 and x*_{j,k+1} = 0 then

(a) if x̄_{jk} = 1 and x̄_{j,k+1} = 0 then λ**_{jk} = ∞ (LD is unbounded and MS is infeasible),

(b) if x̄_{jk} = 1 and x̄_{j,k+1} = 1 then λ**_{jk} = λ*_{jk} + [F(λ*) − F(λ̄)],

(c) if x̄_{jk} = 0 and x̄_{j,k+1} = 0 then λ**_{jk} = λ*_{jk} + [F(λ*) − F(λ̄)],

(d) if x̄_{jk} = 0 and x̄_{j,k+1} = 1 then λ**_{jk} = λ*_{jk} + (1/2)[(λ̄_{jk} − λ*_{jk}) − (F(λ̄) − F(λ*))].

CASE 4: If x*_{jk} = 0 and x*_{j,k+1} = 1 then

(a) if x̲_{jk} = 0 and x̲_{j,k+1} = 1 then λ**_{jk} = −∞ (LD is unbounded and MS is infeasible),

(b) if x̲_{jk} = 0 and x̲_{j,k+1} = 0 then λ**_{jk} = λ*_{jk} − [F(λ*) − F(λ̲)],

(c) if x̲_{jk} = 1 and x̲_{j,k+1} = 1 then λ**_{jk} = λ*_{jk} − [F(λ*) − F(λ̲)],

(d) if x̲_{jk} = 1 and x̲_{j,k+1} = 0 then λ**_{jk} = λ*_{jk} − (1/2)[(λ*_{jk} − λ̲_{jk}) − (F(λ̲) − F(λ*))].

Theorem 1 is fairly easy to implement. The only unknown quantities in the formulas for λ**_{jk} are F(λ̄) and F(λ̲), and depending on x* at most one of these must be found. Computationally, the task of finding either one of these quantities is quite simple, since it is not necessary to solve a new matching problem "from scratch," but only to use sensitivity analysis techniques to reoptimize the matching with two edge-weights altered. The techniques are applied on those edges, one at a time. After computing λ**_{jk}, the two edge-weights are again altered using λ**_{jk} and a new optimal matching is determined using the sensitivity analysis techniques.
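Once the one needed perturbed value F(λ̄) or F(λ̲) is available, the coordinate update is a mechanical transcription of the four cases of Theorem 1. The sketch below is our own; the formulas for Cases 3(b,c) and 4(b,c) follow the derivation in the Appendix proof, so treat them as a reconstruction rather than a verbatim implementation of the paper's code.

```python
import math

def theorem1_step(x_star, x_pert, lam_star, lam_pert, F_star, F_pert):
    """One exact coordinate minimization of F, transcribing Theorem 1.
    x_star   : (x*_jk, x*_j,k+1) in the current optimal matching
    x_pert   : the same pair in the matching optimal for the perturbed
               multiplier (lambda-bar in Case 3, lambda-underbar in Case 4)
    lam_star, lam_pert : lambda*_jk and the perturbed multiplier value
    F_star, F_pert     : F(lambda*) and F at the perturbed multiplier
    Returns lambda**_jk; +/-inf signals that (LD) is unbounded."""
    if x_star[0] == x_star[1]:                      # Cases 1 and 2
        return lam_star
    if x_star == (1, 0):                            # Case 3
        if x_pert == (1, 0):                        # 3(a): dual unbounded
            return math.inf
        if x_pert == (0, 1):                        # 3(d): midpoint formula
            return lam_star + 0.5 * ((lam_pert - lam_star) - (F_pert - F_star))
        return lam_star + (F_star - F_pert)         # 3(b), 3(c)
    # Case 4, x_star == (0, 1): mirror image of Case 3
    if x_pert == (0, 1):                            # 4(a): dual unbounded
        return -math.inf
    if x_pert == (1, 0):                            # 4(d): midpoint formula
        return lam_star - 0.5 * ((lam_star - lam_pert) - (F_pert - F_star))
    return lam_star - (F_star - F_pert)             # 4(b), 4(c)
```

The two sensitivity-analysis reoptimizations (to get F at the perturbed multiplier and then at λ**) are left abstract here, since they depend on Weber's matching machinery [14].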

At each step of the cyclic coordinate method a new vector λ is generated, differing from the previous λ by at most one component. Let λ^t represent the t-th such vector.

THEOREM 2: Assuming v(MS) exists, the sequence {F(λ^t)} converges.

PROOF: For all t, F(λ^{t+1}) ≤ F(λ^t) and, by Proposition 2, the sequence has a lower bound of v(MS). A bounded, nonincreasing sequence has a limit. □

Let F̄ denote the limit of {F(λ^t)}. Zangwill [15] and Luenberger [11] give mild restrictions, including F(λ) having continuous first partial derivatives and a unique minimum point along any coordinate direction, that guarantee global convergence of the cyclic coordinate method. Unfortunately, F(λ) violates these restrictions. It is not necessarily true that F̄ = v(LD). In fact, if F(λ) = F̄ then λ might not even be a local minimum, since it is only a relative minimum with respect to the coordinate directions.






The following is an example in which the sequence {F(λ^t)} generated by the cyclic coordinate method does not converge to v(LD).

Let O₅ be the null matrix of order 5 and Z a 5 × 5 0-1 matrix [entries not legible in this scan]. Let the A matrix of the set partitioning version of (SP) be the 20 × 25 matrix [0-1 block display not legible in this scan].
Let the objective function coefficients corresponding to the first 20 columns of A be 1 and for the last 5 columns be 0. The solutions are: v(SP) = 2, where y_1 = y_10 = y_22 = y_23 = y_24 = 1 and y_j = 0 otherwise; v(SPLP) = 5, where y_j = 1/4, j = 1, ..., 20, and y_j = 0 otherwise; v(LD) = 2, where λ_{10,1} = −1/2, λ_{11,1} = 1/2, λ_{22,1} = 1, λ_{24,1} = −1, and λ_{jk} = 0 otherwise; and {F(λ^t)} → F̄ = 3 using the cyclic coordinate method (established empirically).



Thus, v(SP) = v(LD) < F̄ < v(SPLP). Notice that the bound achieved from solving the Lagrangian dual, for which the subgradient method is successful (established empirically), and the bound from the cyclic coordinate method are both superior to the one obtained by using linear programming. It should be pointed out that in the usual implementation of subgradient optimization, global convergence is not guaranteed.

In addition to avoiding "from scratch" solutions to large matching problems, another important reason for choosing the cyclic coordinate method instead of the subgradient method is that subgradient optimization lacks an important property of the easy to perform cyclic coordinate method. In using subgradient optimization, the sequence {F(λ^t)} is not monotone; it can take quite a few iterations until any progress is made in minimizing F(λ). Since in a branch-and-bound method we are more interested in getting a close approximation for v(LD) in a short period of time than we are in solving it exactly, it seems reasonable to choose a method that begins showing progress in minimizing F(λ) immediately. Actual computational comparisons of the two methods are given in the next section.

4. COMPUTATIONAL RESULTS

The results of the computational experiments performed only at the initial node of a branch-and-bound tree are summarized in three tables. Fourteen problems of varying sizes were run using the cyclic coordinate method (Table 1) and the subgradient method (Table 2) for optimizing the Lagrangian dual (LD), and using linear programming (Table 3) for solving the continuous relaxation of (SP). Each problem contains exactly four ones per column, and in the tables, each is labeled type S2, R1 or RR. S2 is the example given in Section 3. The other two types have constraint matrices consisting of a randomly generated portion containing n − m/4 columns and a set of m/4 "dummy" columns to insure feasibility. The i-th such "dummy" column contains ones in rows 4i−3, 4i−2, 4i−1 and 4i, and zeros elsewhere. The objective function coefficients are zero for these columns. Types R1 and RR differ in the objective function coefficients for the other columns. Problems of type R1 have all the coefficients equal to one, while problems of type RR have randomly generated integer coefficients with values between one and ten.



TABLE 1. Cyclic Coordinate Method*

Problem  m x n      Type of  Initial  Final  Sx=0?  Cycle No. at  Iterations  Time (sec.)† on
                    Data     Value    Value         Termination               IBM 370/168
   1     20 x 20    R1        4.5      3      No         2            13          10
   2     20 x 20    R1        5        3.98   No         4            30          15
   3     20 x 25    S2        4        3‡     No        13            67           4.78
   4     20 x 50    R1        5        5      No         5            40          15
   5     20 x 50    R1        5        5      No         2            14          10
   6     20 x 50    RR       40.5     35.00   No         5            32          30
   7     40 x 40    R1        8.5      3.80   No         2            22          20
   8     40 x 40    R1        9.5      8.25   No         1            16          20
   9     40 x 100   R1       10       10      No         1            15          15
  10     40 x 100   R1       10       10      No         1            17          20
  11     40 x 100   RR       84.5     81.06   No         1            13          30
  12     60 x 60    R1       13        9.70   No         2            29          60
  13     60 x 150   R1       15       15      No         1            18          60
  14     100 x 250  R1       25       25      No         1             5          60

*The program for the matching algorithm is given in [13].
†CPU time. Integer values indicate arbitrarily set CPU time limits.
‡Converging to within .00001 of 3.






TABLE 2. Subgradient Method

Problem  m x n      Type of  Initial  Final    Best   Sx=0?  Iterations  Time (sec.) on
                    Data     Value    Value    Value                     IBM 370/168
   1     20 x 20    R1        4.5       —        —    Yes        9          1.45
   2     20 x 20    R1        5         .60      .57  No        77         15
   3     20 x 25    S2        4        2        2     Yes        9          1.27
   4     20 x 50    R1        5        5.43     5     No        29         15
   5     20 x 50    R1        5        5.58     5     No        34          —
   6     20 x 50    RR       40.5     32.94    32.81  No        69          —
   7     40 x 40    R1        8.5       —        —    Yes       10          5.98
   8     40 x 40    R1        9.5       —        —    Yes       12          7.86
   9     40 x 100   R1       10       18.33    10     No        13         20
  10     40 x 100   R1       10       21.24    10     No        12         20
  11     40 x 100   RR       84.5    105.06    84.5   No        17         30
  12     60 x 60    R1       13         —        —    Yes       16         37.10
  13     60 x 150   R1       15       34.15    15     No        11         60
  14     100 x 250  R1       25      106.53    25     No         5         60


TABLE 3. Linear Programming†

Problem  m x n      Type of  Final    Optimal?  Binary?  Iterations  Time (sec.) on
                    Data     Value                                   IBM 370/168
   1     20 x 20    R1         —       Yes       Yes        19          .29
   2     20 x 20    R1         —       Yes       Yes        24          .34
   3     20 x 25    S2        5        Yes       No         20          .31
   4     20 x 50    R1        5        Yes       No         42          .58
   5     20 x 50    R1        5        Yes       No         78          .80
   6     20 x 50    RR       29.90     Yes       No         45          .73
   7     40 x 40    R1         —       Yes       Yes        54          .65
   8     40 x 40    R1         —       Yes       Yes        57          .74
   9     40 x 100   R1       10        Yes       No        308         5.18
  10     40 x 100   R1       10        Yes       No        350         6.18
  11     40 x 100   RR       71.02     Yes       No        156         2.76
  12     60 x 60    R1         —       Yes       Yes        93         1.23
  13     60 x 150   R1       15        Yes       No        516        13.44
  14     100 x 250  R1       <25.44    No         —       1209        60

†FORTRAN code given in Land and Powell [10].






In Table 1 a distinction is made between a cycle and an iteration. Each time F(λ) is minimized with respect to all of λ_{11}, ..., λ_{1,K_1−1}, ..., λ_{n1}, ..., λ_{n,K_n−1}, in that order, one at a time while the others are fixed, a cycle is completed. However, if when minimizing F(λ) with respect to, say, λ_{jk} we have that x_{jk} ≠ x_{j,k+1}, then this is considered an iteration. Thus, there are potentially as many as m/2 iterations per cycle. Loosely speaking, a cycle in the cyclic coordinate method corresponds to an iteration in the subgradient method.

Very seldom does an algorithm perform uniformly better than another on all problems, and the three methods tested are no exception to this rule. Each out-performs a competing method on at least one of the fourteen problems tested. However, certain general observations can be made. The cyclic coordinate method performs much slower than anticipated, although it does do better than the subgradient method on problem 11. Not surprisingly, linear programming was highly successful in all randomly generated problems except problem 14, the largest one, in which it was inferior to the other two methods. For this problem, linear programming failed to reach an optimum in one minute, while the other two methods were each able to provide useful information since several matching problems were able to be solved in one minute.

We are not discouraged by the fact that the linear programming method out-performs the cyclic coordinate and subgradient methods on the majority of the test problems. The results of problem 3 indicate that there could be a class of problems in which, regardless of the size, the cyclic coordinate and subgradient methods are superior to linear programming, and perhaps more importantly, problem 14 indicates that perhaps for large problems the methods developed here could be a viable alternative to those algorithms that use linear programming.

ACKNOWLEDGMENT

We would like to thank Jack Edmonds for many helpful suggestions, particularly with regard to sensitivity analysis of the optimal matchings.



REFERENCES

[1] Balas, E. and M.W. Padberg, "Set Partitioning," pp. 205-258 in B. Roy, ed., Combinatorial Programming: Methods and Applications (D. Reidel Publishing Co., 1975).

[2] Bazaraa, M.S. and J.J. Goode, "A Survey of Various Tactics for Generating Lagrangian Multipliers in the Context of Lagrangian Duality," School of Industrial and Systems Engineering, Georgia Institute of Technology (1974).

[3] Edmonds, J., "Paths, Trees, and Flowers," Canadian Journal of Mathematics 17, 449-467 (1965).

[4] Edmonds, J., "Maximum Matching and a Polyhedron with 0,1-Vertices," Journal of Research of the National Bureau of Standards 69B, 125-130 (1965).

[5] Edmonds, J., "An Introduction to Matching," notes on lectures given at Ann Arbor, Michigan (1967).

[6] Fisher, M.L., W.D. Northup and J.F. Shapiro, "Using Duality to Solve Discrete Optimization Problems: Theory and Computational Experience," Mathematical Programming Study 3, 56-94 (1975).

[7] Geoffrion, A.M., "Lagrangian Relaxation for Integer Programming," Mathematical Programming Study 2, 82-114 (1974).

[8] Held, M. and R.M. Karp, "The Traveling-Salesman Problem and Minimum Spanning Trees: Part II," Mathematical Programming 1, 6-25 (1971).

[9] Held, M., P. Wolfe and H.P. Crowder, "Validation of Subgradient Optimization," Mathematical Programming 6, 62-88 (1974).




[10] Land, A.H. and S. Powell, Fortran Codes for Mathematical Programming: Linear, Quadratic 
and Discrete (John Wiley and Sons, 1973). 

[11] Luenberger, D.G., Introduction to Linear and Nonlinear Programming (Addison-Wesley, 1973).

[12] Marsten, R.E., "An Algorithm for Large Set Partitioning Problems," Management Science 20, 774-787 (1974).

[13] Weber, G.M., "A Solution Technique for Binary Integer Programming Using Matchings 
on Graphs," Ph.D. Thesis, Cornell University (1978). 

[14] Weber, G.M., "Sensitivity Analysis of Optimal Matchings," TR No. 427, School of Operations Research and Industrial Engineering, Cornell University (May 1979).

[15] Zangwill, W.I., Nonlinear Programming: A Unified Approach (Prentice-Hall, 1969). 






APPENDIX 



PROOF OF THEOREM 1: The proof of Case 2 parallels Case 1 and Case 4 parallels Case 3; thus the proofs of only Cases 1 and 3 are given.

Throughout the proof we use the fact that a change in λ*_{jk} to λ_{jk} changes the objective function coefficients of x_{jk} and x_{j,k+1} in (LR_λ) by λ*_{jk} − λ_{jk} and λ_{jk} − λ*_{jk}, respectively.

CASE 1: By definition F(λ**) = max_x G(λ**, x) ≥ G(λ**, x*). Since x*_{jk} = x*_{j,k+1} = 1, for any λ identical to λ* except for the λ_{jk} component, G(λ, x*) = G(λ*, x*) = F(λ*). In particular, G(λ**, x*) = F(λ*), so that F(λ**) ≥ F(λ*). Now F(λ**) = min{F(λ): λ = λ* except for the λ_{jk} component} ≤ F(λ*). Hence F(λ**) = F(λ*) and λ** = λ*.

CASE 3: Let λ = λ* and consider continuously increasing λ_{jk} from λ*_{jk}, altering the matching x* only if G(λ, x) would increase by doing so. Let λ̃_{jk} be the value of λ_{jk} when x_{jk} first becomes 0, if such a λ_{jk} exists; otherwise set λ̃_{jk} = ∞. Let λ̂_{jk} be the value of λ_{jk} when x_{j,k+1} first becomes 1, if such a λ_{jk} exists; otherwise set λ̂_{jk} = ∞. Let a = min(λ̃_{jk}, λ̂_{jk}) and b = max(λ̃_{jk}, λ̂_{jk}). Note that λ̄_{jk} has been chosen sufficiently large so that if a or b are finite, they are smaller than λ̄_{jk}. Thus x̄_{jk} = 0 if λ̃_{jk} is finite and 1 otherwise, while x̄_{j,k+1} = 1 if λ̂_{jk} is finite and 0 otherwise.

As long as x_{jk} = 1, there is a unit decrease in the objective function per unit increase in λ_{jk}, and when x_{j,k+1} = 1 there is a unit increase in the objective function per unit increase in λ_{jk}. Thus,

(1)    dF(λ)/dλ_{jk} =  −1,  if λ*_{jk} ≤ λ_{jk} < a,
                          0,  if a < λ_{jk} < b,
                          1,  if b < λ_{jk} ≤ λ̄_{jk}.

In Case 3(a), we have a = ∞ so that F(λ) decreases monotonically with λ_{jk} and the dual is unbounded (λ**_{jk} = ∞).

In Cases 3(b,c), a is finite and b = ∞. We can set λ**_{jk} = a. From (1), F(λ̄) − F(λ*) = −(a − λ*_{jk}), so that

λ**_{jk} = a = λ*_{jk} + F(λ*) − F(λ̄).

In Case 3(d), a and b are finite. We can set λ**_{jk} = (a + b)/2. From (1), F(λ̄) − F(λ*) = −(a − λ*_{jk}) + (λ̄_{jk} − b), so that

λ**_{jk} = (a + b)/2 = (λ*_{jk} + λ̄_{jk} + F(λ*) − F(λ̄))/2. □






A COMPLETE IMPORTANCE RANKING FOR COMPONENTS 
OF BINARY COHERENT SYSTEMS, WITH 
EXTENSIONS TO MULTI-STATE SYSTEMS 

David A. Butler 

Oregon State University 
Corvallis, Oregon 

ABSTRACT 

Means of measuring and ranking a system's components relative to their importance to the system reliability have been developed by a number of authors. This paper investigates a new ranking that is based upon minimal cuts and compares it with existing definitions. The new ranking is shown to be easily calculated from readily obtainable information and to be most useful for systems composed of highly reliable components. The paper also discusses extensions of importance measures and rankings to systems in which both the system and its components may be in any of a finite number of states. Many of the results about importance measures and rankings for binary systems are shown to extend to the more sophisticated multi-state systems. Also, the multi-state importance measures and rankings are shown to be decomposable into a number of sub-measures and rankings.

Given a system composed of many components, a question of considerable interest is which components are most crucial to the proper functioning of the system. In response to this question, a number of importance measures and rankings have been proposed [3], [4], [5], [10]. This paper investigates a new ranking and compares it to existing rankings, principally the ranking induced by the Birnbaum reliability importance measure. The new ranking is based upon minimal cuts and provides a complete ordering of all the system's components relative to their importance to the system reliability. This ranking has three key points in its favor: (i) the calculations involved require only readily obtainable information; (ii) the calculations are usually quite simple; and (iii) the ranking is designed for use with systems consisting of highly reliable components, the most common case.

The final section of the paper deals with extensions of importance measures and rankings to systems in which both the system and its components may be in any of a finite number of states. Many of the results about importance measures and rankings for binary systems established in preceding sections are shown to extend to the more sophisticated multi-state systems. Also, the multi-state importance measures and rankings are shown to be decomposable into a number of sub-measures and rankings.

1. DEFINITIONS OF COMPONENT IMPORTANCE: BINARY SYSTEMS

Consider a binary coherent system of n independent components with structure function φ(x) and reliability function h(p). (Our definitions, terminology and notation regarding binary coherent systems follow those of [2].)

565 



566 D.A. BUTLER



Birnbaum [4] and Barlow and Proschan [3] have each proposed reliability importance measures, so called because they make use of probabilistic information about the components. The Birnbaum reliability measure of the importance of component i is I_b(i; p) = ∂h(p)/∂p_i. The Barlow-Proschan reliability measure requires the time-to-failure distribution for each component, and the importance of a component i using this measure can be interpreted as the probability that component i causes the system to fail [3].
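Because h(p) is multilinear in the component reliabilities, the partial derivative ∂h(p)/∂p_i equals h(1_i, p) - h(0_i, p). The sketch below (Python; the three-component series-parallel system and all names are hypothetical illustrations, not from the paper) computes the Birnbaum reliability importance this way.

```python
def birnbaum_importance(h, i, p):
    """I_b(i; p) = h(1_i, p) - h(0_i, p); this equals dh/dp_i because
    the reliability function h is multilinear in the p_k."""
    hi, lo = list(p), list(p)
    hi[i], lo[i] = 1.0, 0.0
    return h(hi) - h(lo)

# Hypothetical system: components 0 and 1 in parallel, in series with 2.
def h_example(p):
    return (1 - (1 - p[0]) * (1 - p[1])) * p[2]

p = [0.9, 0.9, 0.9]
print([birnbaum_importance(h_example, i, p) for i in range(3)])
# the lone series component 2 is far more important than 0 or 1
```

With all reliabilities 0.9 the redundant components each score 0.09 while the series component scores 0.99, illustrating why importance measures focus attention on single points of failure.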

The two sets of authors have also proposed structural importance measures, so called 
because they require only a knowledge of the system structure function to be calculated. This 
feature gives them an important practical advantage over the more sophisticated reliability 
importance measures, because often the more detailed knowledge required for the calculation 
of these latter measures is unobtainable. Both structural measures can be derived from the 
Birnbaum reliability importance measure, assuming a common reliability p for all components. 
Specifically, the Birnbaum structural importance measure is the Birnbaum reliability importance 
measure evaluated at p = 0.5 [4]. The Barlow-Proschan structural importance measure [3] is 
the "average" (integral) of the Birnbaum reliability measure as p ranges over [0,1]. But for 
most systems the typical component's reliability is not 0.5 or even 0.5 "on the average," but 
rather is much higher. This is especially so for systems with complex structure functions incor- 
porating redundancy, because redundancy in the design of a system is usually only incorporated 
if a non-redundant design with highly reliable components cannot produce a satisfactory level of 
system reliability. Thus using either of these two measures to compare components of such 
systems may give a misleading picture of which components are most important. It would seem 
desirable, therefore, to develop a measure or ranking that is structural (i.e., is based solely 
upon the system structure function and therefore not upon p), yet is somehow related to the 
Birnbaum reliability importance measure for high values of p. The ranking proposed in this 
paper is such a result. This ranking is based upon cuts and provides a complete ordering of all 
components. It extends an earlier ranking which provided only a partial ordering of the com- 
ponents [5]. 

To introduce this ranking, consider the following example.

EXAMPLE 1:

φ(x) = [1 - (1 - x_1)(1 - x_5)] · [1 - (1 - x_2)(1 - x_3)] · [1 - (1 - x_2)(1 - x_6)] ·
       [1 - (1 - x_4)(1 - x_5)(1 - x_6)] · [1 - (1 - x_3)(1 - x_4)(1 - x_5)] · [1 - (1 - x_1)(1 - x_2)].

Min cuts: C_1 = {1,5}, C_2 = {2,3}, C_3 = {2,6},
          C_4 = {4,5,6}, C_5 = {3,4,5}, C_6 = {1,2}.

Assuming that the components function or fail independently of one another and assuming a 
common reliability p for each component, the Birnbaum reliability importances can be written 
as 



IMPORTANCE RANKING FOR COMPONENTS 567 

I_b(1; p) = 2(1-p) - 3(1-p)^2 - (1-p)^3 + 3(1-p)^4 - (1-p)^5

I_b(2; p) = 3(1-p) - 4(1-p)^2 - (1-p)^3 + 3(1-p)^4 - (1-p)^5

I_b(3; p) = (1-p) - (1-p)^2 - 2(1-p)^3 + 3(1-p)^4 - (1-p)^5

I_b(4; p) = 2(1-p)^2 - 5(1-p)^3 + 4(1-p)^4 - (1-p)^5

I_b(5; p) = (1-p) + (1-p)^2 - 5(1-p)^3 + 4(1-p)^4 - (1-p)^5

I_b(6; p) = (1-p) - (1-p)^2 - 2(1-p)^3 + 3(1-p)^4 - (1-p)^5

Denote the component orderings induced by the Birnbaum structural importance measure, the Barlow-Proschan structural importance measure, and the Birnbaum reliability importance measure by >_BS, >_BPS, >_BR, respectively. (Note that >_BR depends implicitly upon p.) Then

2 >_BS 5 >_BS 1 >_BS 3 =_BS 6 >_BS 4,

2 >_BPS 5 >_BPS 1 >_BPS 3 =_BPS 6 >_BPS 4,

and

2 >_BR 5, 1 >_BR 3 =_BR 6 >_BR 4 for all p = (p, p, ..., p).

For p < (-1 + √5)/2 ≈ .618, 5 >_BR 1 and the three rankings are identical. But for p > (-1 + √5)/2, 1 >_BR 5.
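The crossover between components 1 and 5 can be checked numerically. A minimal sketch, assuming Python and 0-indexed components (so components 1 and 5 of the example are indices 0 and 4): it evaluates h(p) by brute-force enumeration of all 2^6 component states, which is feasible only for small systems.

```python
from itertools import product

# Min cuts of Example 1, components renumbered 1..6 -> 0..5
CUTS = [{0, 4}, {1, 2}, {1, 5}, {3, 4, 5}, {2, 3, 4}, {0, 1}]
N = 6

def phi(x):
    # the system functions iff every min cut contains a working component
    return all(any(x[k] for k in cut) for cut in CUTS)

def weight(x, p):
    w = 1.0
    for k in range(N):
        w *= p[k] if x[k] else 1 - p[k]
    return w

def h(p):
    return sum(weight(x, p) for x in product([0, 1], repeat=N) if phi(x))

def I_b(i, p):
    hi, lo = list(p), list(p)
    hi[i], lo[i] = 1.0, 0.0
    return h(hi) - h(lo)

root = (5 ** 0.5 - 1) / 2  # the crossover point (-1 + sqrt(5))/2 ≈ 0.618
print(I_b(4, [0.55] * N) - I_b(0, [0.55] * N))  # positive: 5 >_BR 1
print(I_b(4, [0.70] * N) - I_b(0, [0.70] * N))  # negative: 1 >_BR 5
```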

Notice that if the Birnbaum reliability measures I_b(i; p) are written as polynomials in (1 - p) as above, then for high values of p the lowest-order terms in the polynomial dominate the rest. Thus, in this example, by looking only at the lowest-order terms in the formulas for I_b(i; p) it is apparent that for high values of p,

2 >_BR 1, 1 >_BR 5, 1 >_BR 3, 1 >_BR 6,
5 >_BR 4, 3 >_BR 4, 6 >_BR 4.

By examining the lowest- and the second-lowest-order terms, we can further determine that for high values of p

5 >_BR 3 and 5 >_BR 6.

This suggests a possible way to define a new structural ranking that would agree with >_BR for high values of p. It will be more convenient to define this new structural ranking in terms of the system's min cuts, rather than the coefficients in the polynomial expressions for I_b(i; p). However, we will see that the resulting definition is equivalent to the above.

DEFINITION: For each component l of a coherent system (N, φ) with t minimal cuts, let d_{ij}^(l) denote the number of collections of i distinct min cuts such that the union of each collection contains exactly j components and includes component l, (1 ≤ i ≤ t, 1 ≤ j ≤ n). Let

b_j^(l) = Σ_{i=1}^{t} (-1)^{i-1} d_{ij}^(l).

Let b^(l) = (b_1^(l), ..., b_n^(l)). Component l is more cut-important than component k, denoted l >_c k, if and only if b^(l) ≫ b^(k), where ≫ denotes lexicographic ordering. Components l and k are equally cut-important, denoted l =_c k, if and only if b^(l) = b^(k).

This definition, although rather formidable in appearance, is in practice usually easy to apply and work with. Because the ranking depends upon a lexicographic ordering, most components can be ranked by determining only the first few components of b^(l). Also, the first non-zero component of any b^(l) is particularly easy to compute (see Proposition 1 and Corollary 1).




EXAMPLE 1. (continued)

For l = 2, the non-zero d_{ij}^(2)'s are as follows: d_{12}^(2) = 3, d_{23}^(2) = 4, d_{24}^(2) = 4, d_{25}^(2) = 4, d_{34}^(2) = 3, d_{35}^(2) = 11, d_{36}^(2) = 5, d_{45}^(2) = 4, d_{46}^(2) = 11, d_{56}^(2) = 6, d_{66}^(2) = 1. Thus b^(2) = (0, 3, -4, -1, 3, -1). Similarly,

b^(1) = (0, 2, -3, -1, 3, -1),  b^(3) = (0, 1, -1, -2, 3, -1),
b^(4) = (0, 0, 2, -5, 4, -1),   b^(5) = (0, 1, 1, -5, 4, -1),
b^(6) = (0, 1, -1, -2, 3, -1).

Therefore 2 >_c 1 >_c 5 >_c 3 =_c 6 >_c 4.

Note that b^(l) is the vector of coefficients in the polynomial expression for I_b(l; p). We will show that this is so in general. Also note that the cut-importance ranking and the Birnbaum reliability importance ranking for high values of p are in agreement.
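The definition of b^(l) can be computed directly by enumerating collections of min cuts. The sketch below (Python, illustrative only; a practical system would need the shortcuts discussed in Section 2) reproduces the b vectors of Example 1 and the induced ranking. Python compares lists lexicographically, so sorting on the b vectors orders the components exactly as >_c does.

```python
from itertools import combinations

CUTS = [{1, 5}, {2, 3}, {2, 6}, {4, 5, 6}, {3, 4, 5}, {1, 2}]  # Example 1
N = 6

def b_vector(l):
    """b_j^(l) = sum_i (-1)^(i-1) d_ij^(l), where d_ij^(l) counts the
    collections of i distinct min cuts whose union has exactly j
    components and contains component l."""
    b = [0] * N
    for i in range(1, len(CUTS) + 1):
        for coll in combinations(CUTS, i):
            union = set().union(*coll)
            if l in union:
                b[len(union) - 1] += (-1) ** (i - 1)
    return b

print(b_vector(2))                                  # [0, 3, -4, -1, 3, -1]
ranking = sorted(range(1, N + 1), key=b_vector, reverse=True)
print(ranking)                                      # [2, 1, 5, 3, 6, 4]
```

The printed ranking groups the tied components 3 and 6 in their original order, matching 2 >_c 1 >_c 5 >_c 3 =_c 6 >_c 4.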

2. ANALYSIS OF THE CUT-IMPORTANCE RANKING-BINARY SYSTEMS 

As stated in the introduction, the cut-importance ranking has three main favorable pro- 
perties: (i) it is based upon readily obtainable information, (ii) is usually easily calculated, and 
(iii) is designed for use when component reliabilities are high. The first property is already 
established, since this ordering is based only upon the system structure function through the 
minimal cuts of the system. This section will deal with the second and third properties. 

The precise meaning of the third property of the cut-importance ranking is given in 
Theorems 1 and 2 below. The first theorem relates the cut-importance ranking to the Birn- 
baum reliability importance measure in the case where the component reliabilities are equal and 
high. 

THEOREM 1: For p = (p, p, ..., p) where the scalar p is sufficiently close to one, the orderings >_BR and >_c are identical.

PROOF: The above is a direct result of Lemma 1 which follows. Using the lemma, it is clear that l =_c k if and only if I_b(l; p) = I_b(k; p) for all p. [A scalar second argument is used in I_b(k; p) when p = (p, ..., p).] Also, l >_c k if and only if b^(l) - b^(k) ≫ 0, and b^(l) - b^(k) ≫ 0 if and only if I_b(l; p) > I_b(k; p) for all p sufficiently close to one. □

LEMMA 1: I_b(l; p) = Σ_{j=1}^{n} b_j^(l) (1 - p)^{j-1}.



PROOF: h(p) = Pr[∩_{i=1}^{t} E_i],

where E_i denotes the event that at least one component in the i-th min cut functions. Thus

h(p) = 1 - Pr[∪_{i=1}^{t} Ē_i].

By the inclusion-exclusion principle ([8], pp. 98-101),

h(p) = 1 - Σ_{i=1}^{t} (-1)^{i-1} S_i,

where

S_i = Σ_{1≤j_1<j_2<···<j_i≤t} Pr[Ē_{j_1} ∩ Ē_{j_2} ∩ ··· ∩ Ē_{j_i}].

Now the event Ē_{j_1} ∩ Ē_{j_2} ∩ ··· ∩ Ē_{j_i} is the event that all components k ∈ C_{j_1} ∪ C_{j_2} ∪ ··· ∪ C_{j_i} fail. Thus, using the independence assumption,

S_i = Σ_{1≤j_1<···<j_i≤t} ∏_{k ∈ C_{j_1} ∪ ··· ∪ C_{j_i}} (1 - p_k),

where C_1, ..., C_t are the minimal cuts of the system. Thus

I_b(l; p) = ∂h(p)/∂p_l = Σ_{i=1}^{t} (-1)^{i-1} Σ_{1≤j_1<···<j_i≤t: l ∈ C_{j_1} ∪ ··· ∪ C_{j_i}} ∏_{k ∈ C_{j_1} ∪ ··· ∪ C_{j_i}, k≠l} (1 - p_k).

Recalling p_1 = p_2 = ··· = p_n = p, and the definition of d_{ij}^(l),

I_b(l; p) = Σ_{i=1}^{t} Σ_{j=1}^{n} (-1)^{i-1} d_{ij}^(l) (1 - p)^{j-1} = Σ_{j=1}^{n} b_j^(l) (1 - p)^{j-1}.  □

Theorem 1 establishes the relationship the cut-importance ranking has to the Birnbaum reliability importance ranking in the case of high and equal component reliabilities. We now consider the case where the component reliabilities are high but unequal. Let p(ε) be a vector-valued function of the positive scalar ε for which 0 < p_i(ε) < 1 for all ε ∈ (0,∞) and 1 ≤ i ≤ n. Let lim_{ε→0} p(ε) = 1. Unfortunately, it is not true in general that the component ordering induced by I_b(·; p(ε)) coincides with >_c for all ε sufficiently close to zero. However, with some additional assumptions on p(ε) some partial results along these lines are possible. First we establish a simple and computationally convenient formula for the first non-zero coordinate in any vector b^(l).

PROPOSITION 1: For each component k, let e_k be the cardinality of the smallest minimal cut containing component k, and let f_k be the number of minimal cuts of cardinality e_k containing k. Then (i) e_k = min{j: b_j^(k) ≠ 0}, and (ii) f_k = b_{e_k}^(k).

PROOF: By definition d_{1,e_k}^(k) = f_k. Also any union of two or more minimal cuts at least one of which contains k must have cardinality at least e_k + 1. Thus d_{i,e_k}^(k) = 0 for all i ≥ 2. Therefore

b_{e_k}^(k) = Σ_{i=1}^{t} (-1)^{i-1} d_{i,e_k}^(k) = f_k.

Also, since component k is contained in no cuts of cardinality smaller than e_k, d_{ij}^(k) = 0 for all j < e_k. Thus b_j^(k) = 0 for j < e_k. □




COROLLARY 1: (i) If e_l < e_k, then l >_c k.

(ii) If e_l = e_k and f_l > f_k, then l >_c k.
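The quantities e_k and f_k require only a scan of the min-cut list. A minimal sketch for Example 1 (Python, illustrative), including the partial ordering Corollary 1 yields:

```python
CUTS = [{1, 5}, {2, 3}, {2, 6}, {4, 5, 6}, {3, 4, 5}, {1, 2}]  # Example 1

def e_f(k):
    """e_k: cardinality of the smallest min cut containing k;
    f_k: number of min cuts of that cardinality containing k."""
    sizes = sorted(len(c) for c in CUTS if k in c)
    return sizes[0], sizes.count(sizes[0])

for k in range(1, 7):
    print(k, e_f(k))

# Corollary 1: smaller e first, then larger f first
partial = sorted(range(1, 7), key=lambda k: (e_f(k)[0], -e_f(k)[1]))
print(partial)   # [2, 1, 3, 5, 6, 4]; components 3, 5, 6 tie at (2, 1)
```

The tie among components 3, 5, and 6 is exactly the case where the deeper entries of b^(l) are needed, as discussed below.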

THEOREM 2: Let q_i(ε) = 1 - p_i(ε), and assume that for some M_1, M_2 ∈ R,

1/M_1 ≤ q_l(ε)/q_k(ε) ≤ 1/M_2 for all sufficiently small ε.

If either (i) e_l < e_k, or (ii) e_l = e_k and (f_l/f_k) > (M_1/M_2)^{e_l}, then there exists an ε̄ > 0 such that I_b(l; p(ε)) > I_b(k; p(ε)) for all ε < ε̄.

PROOF: See Theorem 2 in [5]. □

Further results along these lines are surely possible, but their value is questionable because the hypotheses become too complex. From a practical standpoint, users of the cut-importance ranking should be aware that while the cut-importance ranking can be useful even when component reliabilities are unequal, it may be misleading if the differences in the orders of magnitude of the unreliabilities are too great.

To summarize, the reliability importance measures and rankings probably give generally 
superior results to the structural ones and should be used unless (i) the probabilistic informa- 
tion required for their calculation is not available or (ii) computations involved are prohibitively 
extensive. However when one or the other of these conditions prevails, a structural measure or 
ranking must be employed. The Birnbaum or the Barlow-Proschan structural measure can be 
used but we have seen that doing so is equivalent to using the Birnbaum reliability importance 
measure //, (/;p) with p = 0.5 or 0.5 "on the average." If one feels that the component reliabili- 
ties, although not precisely known, are high, then the cut-importance ranking seems preferable, 
since, as Theorems 1 and 2 have shown, its results are the same as those given by the Birn- 
baum reliability importance ranking for high values of p. 

We now turn to the question of the computational complexities involved in determining the cut-importance ranking of a system's components. It is clear that the task of computing the entire vector b^(k) for each component k can be a formidable one for a complex system with many minimal cuts. However, Proposition 1 and Corollary 1 show that components often can be compared by only determining the easily computed quantities e_k and f_k. For instance, in Example 1 it is possible to determine that

2 >_c 1 >_c 5, 3, 6 >_c 4

in this manner. Also, since the structure function is symmetric in x_3 and x_6, it is clear that 3 =_c 6. Thus additional calculations are necessary only to compare components 3 and 5. The ordering of these two components can be determined by computing the next entries in b^(3) and b^(5), namely b_3^(3) and b_3^(5). The last three entries in each vector b^(l) are irrelevant for the purposes of ranking the components in this example.

In general, most components can be compared by determining the first non-zero entry in b^(k) via Corollary 1. Other entries in b^(k) are computed only as necessary.

Computations can also be simplified when the system under consideration contains 
modules. 

PROPOSITION 2: Let (A, χ) be a module of (N, φ) and let φ(x) = ψ(χ(x^A), x^{A'}). Let φb^(·), χb^(·), and ψb^(·) be the b^(·) vectors corresponding to the structures φ, χ, and ψ, respectively. Then

φb_j^(k) = Σ_{i=1}^{j} χb_i^(k) · ψb_{j-i+1}^(A) for all k ∈ A,

where the definition of χb^(k) is extended to include zero coordinates for i > |A|, and ψb^(A) is extended similarly. (The above equation is just an expression of the fact that φb^(k) is the convolution of the finite sequences χb^(k) and ψb^(A).)

PROOF: In the remainder of the paper the dependence of I_b(i; p) upon p will at times be suppressed and the notation simplified to I_b(i). Let I_b^φ(·), I_b^χ(·), and I_b^ψ(·) denote the respective Birnbaum reliability importance measures. These three quantities are related as follows [4]:

I_b^φ(k) = I_b^ψ(A) · I_b^χ(k) for all k ∈ A.

Thus by Lemma 1,

Σ_{j=1}^{n} φb_j^(k) (1-p)^{j-1} = [Σ_{i=1}^{|A|} χb_i^(k) (1-p)^{i-1}] · [Σ_{i=1}^{n-|A|+1} ψb_i^(A) (1-p)^{i-1}]
                                = Σ_{j=1}^{n} [Σ_{i=1}^{j} χb_i^(k) ψb_{j-i+1}^(A)] (1-p)^{j-1}.

Since this equality holds for all 0 ≤ p ≤ 1, each pair of coefficients of the two polynomials must be identical. □

Proposition 2 can be applied to make the calculation of the cut-importance component ranking simpler when the system contains modules.

EXAMPLE 1. (continued) Components 3 and 6 form a module.

A = {3,6},  χ(x^A) = x_3 x_6,  ψ(z, x^{A'}) = 1 - (1 - x_1 x_2 x_4)(1 - x_1 z)(1 - x_2 x_5).

χb^(3) = (1, -1),  χb^(6) = (1, -1),  ψb^(A) = (0, 1, 0, -2, 1).
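The convolution in Proposition 2 can be verified for this module decomposition. The sketch below (Python, illustrative) convolves χb^(3) with ψb^(A) and recovers the b^(3) computed earlier from the full structure.

```python
def convolve(a, b):
    """Convolution of finite sequences: c_j = sum_i a_i * b_(j-i+1)
    in the paper's 1-indexed notation (0-indexed here)."""
    c = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            c[i + j] += ai * bj
    return c

chi_b3 = [1, -1]              # chi b^(3) for the series module {3, 6}
psi_bA = [0, 1, 0, -2, 1]     # psi b^(A) for the organizing structure
print(convolve(chi_b3, psi_bA))   # [0, 1, -1, -2, 3, -1] = b^(3)
```

The convolution of the 2-term and 5-term sequences yields the 6-term b^(3) = (0, 1, -1, -2, 3, -1) of Example 1, as Proposition 2 asserts.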

By using the concept of the dual of a coherent system, it is possible to develop a component ranking analogous to >_c, but based upon minimal paths instead of minimal cuts. This ordering can be shown to be identical to the ordering induced by I_b(·; p) when p is sufficiently small.



3. COMPONENT IMPORTANCE IN MULTI-STATE SYSTEMS

This section deals with extensions of the results of the preceding sections to systems in which both the system and its components may be in any of a finite number of states. Of course, any such increase in the sophistication of the model used to represent a real system entails disadvantages as well as advantages, and this extension to multi-state models is not intended to suggest that binary models are generally inadequate. To the contrary, in most cases they suffice quite well. However, in some instances a small increase in the number of states (say, to three or perhaps four) can result in a much improved model.

One of the main difficulties with multi-state models is the increased notational complexity. For this reason, and because the number of states in a practical model must be kept small if the model is to be manageable, the following definitions will be given for ternary (three-state) systems; however, they will be given in a manner that illustrates the extension to



general n-state systems. Whenever the extension of a definition or result to n-state systems is unclear, some further explanation will be given.

The study of multi-state systems is a relatively new area in reliability theory. Most articles 
in this area have dealt with generalizing particular classes of results [1], [7], [9], [11], [12]. The 
most general paper in the area is Barlow's [1]. Let X_j denote the state of component j (X_j = 0, 1, 2; 1 ≤ j ≤ n). Given a collection of minimal cuts C_1, C_2, ..., C_t which define the system structure, Barlow defines the system state φ(X) as the state of the "best" component in the "worst" min cut, i.e.,

φ(X) = min_{1≤i≤t} {max_{j∈C_i} {X_j}}.
Let Z_j = I_{[X_j ≥ k]} and let ψ = I_{[φ(X) ≥ k]}. Both Z and ψ are binary, and ψ is a function only of Z. Because of this property, most results about binary coherent systems have immediate generalizations under Barlow's extended definition. However, there are many more reasonable choices for the structure function in a multi-state setting than Barlow's definition allows (see [6] for some examples). To accommodate such choices, a more general definition of a multi-state coherent system is proposed below.
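Barlow's min-max structure function, and the fact that the indicator ψ = I_{[φ(X) ≥ k]} depends on X only through the indicators Z_j = I_{[X_j ≥ k]}, can be illustrated directly. The three-component cut family below is a hypothetical example, not from the paper.

```python
from itertools import product

def barlow_phi(x, cuts):
    """Barlow's structure: state of the best component in the worst cut."""
    return min(max(x[k] for k in cut) for cut in cuts)

# Hypothetical three-component system with min cuts {0,1} and {1,2}
cuts = [{0, 1}, {1, 2}]
print(barlow_phi((2, 0, 1), cuts))   # min(max(2,0), max(0,1)) = 1

# psi = I[phi(X) >= k] is a function of Z_j = I[X_j >= k] alone:
# the same min-max formula applied to Z reproduces the indicator.
for x in product(range(3), repeat=3):
    for k in (1, 2):
        z = tuple(int(xj >= k) for xj in x)
        assert int(barlow_phi(x, cuts) >= k) == barlow_phi(z, cuts)
```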

Let S = {x ∈ R^n: x_i = 0, 1, 2}, and let (·_i, x) = (x_1, x_2, ..., x_{i-1}, ·, x_{i+1}, ..., x_n).

DEFINITION: Component i is relevant if and only if φ(2_i, x) ≠ φ(0_i, x) for some x ∈ S. Otherwise component i is irrelevant. Component i is fully relevant if and only if φ(2_i, x) ≠ φ(1_i, x) for some x ∈ S and φ(1_i, y) ≠ φ(0_i, y) for some y ∈ S.

DEFINITION: A structure function is coherent if and only if

(i) φ(0) = 0; φ(2) = 2,

(ii) φ(x) is non-decreasing in x,

(iii) each component is relevant.

The ordered pair (N, φ) is called a (generalized or ternary) coherent system.

If a component is not fully relevant, then only two states are required to describe its 
status. Such components are permissible in a generalized coherent system to allow for a mix- 
ture of binary and ternary components. 

Define the matrix P = [p_{ij}] by

p_{ij} = Pr[component i is in state j], 1 ≤ i ≤ n, 0 ≤ j ≤ 2.

The reliability function, h(P), is defined by

h(P) = Pr[φ(X) ≥ m],

where 0 < m ≤ 2. The effect of this generalized definition of the reliability function is to consider systems whose components have several states but whose structure function effectively has two states (≥ m or < m). All subsequent definitions and results are for a fixed value of m. (For simplicity, the dependence of h(·) upon m is suppressed in the notation.) For any matrix A = [a_{ij}], let (k_i, A) denote the matrix whose l,j-th entry is given by

(k_i, A)_{lj} = a_{lj}  if l ≠ i,
             = 1       if l = i, j = k,
             = 0       if l = i, j ≠ k.


DEFINITION: The r,s reliability importance of component i, denoted by I_b^{r,s}(i; P), is given by

I_b^{r,s}(i; P) = h(r_i, P) - h(s_i, P),

where r, s = 0, 1, 2 and r > s. The 2,0 reliability importance will sometimes be simply called the reliability importance and be denoted by I_b(i; P). The r,s reliability importance of component i is the probability that the system is in state m or better given component i is in state r minus the probability that the system is in state m or better given component i is in state s.

DEFINITION: A vector x ∈ S is r,s critical for component i if and only if φ(r_i, x) ≥ m and φ(s_i, x) < m (r, s = 0, 1, 2; r > s).

DEFINITION: Let n_φ^{r,s}(i) = |{x ∈ S: x is r,s critical for component i}|. The r,s structural importance of component i, I_φ^{r,s}(i), is given by

I_φ^{r,s}(i) = n_φ^{r,s}(i)/3^n.

In the following, whenever a vector p ∈ R^3 appears in an expression normally involving the matrix P, P will be understood to be the matrix all of whose rows are equal to p.

PROPOSITION 3: (i) I_b^{2,0}(i; P) = I_b^{2,1}(i; P) + I_b^{1,0}(i; P),

(ii) I_φ^{2,0}(i) = I_φ^{2,1}(i) + I_φ^{1,0}(i),

(iii) I_φ^{r,s}(i) = I_b^{r,s}(i; (1/3, 1/3, 1/3)).

PROOF: The proofs of (i) and (ii) are trivial. To prove (iii), note that by summing over the 3^{n-1} possible values for X_k, k ≠ i,

h(j_i, (1/3, 1/3, 1/3)) = 3^{-n+1} Σ_{x: x_i = j} I_{[φ(j_i,x) ≥ m]} = 3^{-n} Σ_{x∈S} I_{[φ(j_i,x) ≥ m]},

where in the above, I_A denotes the indicator function of the set A. Thus

I_b^{2,1}(i; (1/3, 1/3, 1/3)) = 3^{-n} Σ_{x∈S} [I_{[φ(2_i,x) ≥ m]} - I_{[φ(1_i,x) ≥ m]}] = 3^{-n} n_φ^{2,1}(i) = I_φ^{2,1}(i).

The proof for I^{1,0}(·) is the same and the proof for I^{2,0}(·) follows from parts (i) and (ii). □

Parts (i) and (ii) of the above result show that both 2,0 importance measures decompose into the sum of the 2,1 and 1,0 importance measures. The generalized cut-importance ranking to be defined later has a similar property. In practice, it is likely that the 2,0 measures and rankings would be the most commonly used. However, the other measures and rankings can be




be useful in providing more detailed information about which states are most relevant in deter- 
mining a given component's ranking. (See Example 2.) 

Given a generalized coherent system (N, φ), and a partition C = (C_0, C_1, C_2) of N into three sets, define x(C) ∈ S by

(x(C))_i = 0  if i ∈ C_0,
         = 1  if i ∈ C_1,
         = 2  if i ∈ C_2.

The function x(C) shows how any partition C determines the states of all the components.

DEFINITION: A partition C = (C_0, C_1, C_2) of N is a cut if and only if φ(x(C)) < m. A cut C is a minimal cut if and only if φ(y) ≥ m for all y ∈ S such that y ≥ x(C), y ≠ x(C).

While it is in principle possible to develop a complete cut-importance ranking for generalized coherent systems, in practice the calculation of the entire generalized b^(·) vector for each component is too complex to be feasible. However, a partial ordering of the components which involves very few calculations can be developed by generalizing Proposition 1 and Corollary 1 appropriately. First, the notions of the size of a partition and the union of partitions must be defined.

DEFINITION: The size of a partition C = (C_0, C_1, C_2), denoted by z(C), is a_0|C_0| + a_1|C_1|. (a_0, a_1 are arbitrary constants satisfying a_0 > a_1 > 0.) The roles of the constants a_0, a_1 are discussed later.

DEFINITION: Let T^1, T^2, ..., T^l be partitions of N, where T^i = (T_0^i, T_1^i, T_2^i). The union of T^1, ..., T^l is the partition V = (V_0, V_1, V_2), where V_0 = ∪_{i=1}^{l} T_0^i, V_1 = (∪_{i=1}^{l} T_1^i) - V_0, and V_2 = N - (V_0 ∪ V_1), so that x(V) = min_{1≤i≤l} x(T^i) componentwise.

DEFINITION: Consider a ternary coherent system with minimal cuts C^i = (C_0^i, C_1^i, C_2^i), i = 1, 2, ..., t. For each component k, let

e_k^{r+1,r} = min {z(C^i): k ∈ C_r^i} - a_r,

and let

f_k^{r+1,r} = |{C^i: k ∈ C_r^i, z(C^i) - a_r = e_k^{r+1,r}, 1 ≤ i ≤ t}|.

(By convention e_k^{r+1,r} = +∞ if k ∉ C_r^i, 1 ≤ i ≤ t; e_k^{r+1,r} is just the size of the "smallest" min cut which contains k in the r-th set of the partition, less a_r, and f_k^{r+1,r} is the number of such min cuts.) For all r, s = 0, 1, 2 such that r > s, define

e_k^{r,s} = min_{s≤u<r} {e_k^{u+1,u}}

and

f_k^{r,s} = Σ_{u: s≤u<r, e_k^{u+1,u} = e_k^{r,s}} f_k^{u+1,u}.





Component l is more r,s cut-important than component k, denoted l >_c^{r,s} k, if and only if either

(i) e_l^{r,s} < e_k^{r,s},

or

(ii) e_l^{r,s} = e_k^{r,s} and f_l^{r,s} > f_k^{r,s}.

The 2,0 cut-importance ranking will sometimes be simply called the cut-importance ranking and be denoted by >_c. As in the binary case, each of the r,s importance rankings is consistent with the ranking induced by the corresponding r,s importance measure. To be more specific, let p(ε) = (ε^{a_0}, ε^{a_1}, 1 - ε^{a_0} - ε^{a_1}). As ε approaches zero, p(ε) puts almost all its mass on the best state, state 2. Of the mass left over, the ratio of the mass put on state 0 to that put on state 1 approaches zero. Thus the parameters a_0 and a_1 give the relative weights put on components in state zero versus components in state 1 in the cut-importance ranking and also determine the relative likelihoods of a component partially failing (state 1) and fully failing (state zero).

THEOREM 3: For ε sufficiently close to zero, the component ranking induced by I_b^{r,s}(·; p(ε)) is consistent with the r,s cut-importance ranking (r > s).



PROOF: I_b^{r+1,r}(k; p(ε)) = h((r+1)_k, p(ε)) - h(r_k, p(ε))

= [1 - Pr(∪_{i=1}^{t} E_i | X_k = r+1)] - [1 - Pr(∪_{i=1}^{t} E_i | X_k = r)],

where E_i = {X ≤ x(C^i)}. Thus by the inclusion-exclusion principle,

(1) I_b^{r+1,r}(k; p(ε)) = Σ_{i=1}^{t} (-1)^{i-1} [S_i^r - S_i^{r+1}],

where

S_i^r - S_i^{r+1} = Σ_{j∈J_i} [Pr(∩_{l=1}^{i} E_{j_l} | X_k = r) - Pr(∩_{l=1}^{i} E_{j_l} | X_k = r+1)]

and J_i = {(j_1, j_2, ..., j_i): 1 ≤ j_1 < j_2 < ··· < j_i ≤ t}. Letting G = {j ∈ J_i: min_{1≤l≤i} x_k(C^{j_l}) = r},

S_i^r - S_i^{r+1} = Σ_{j∈G} Pr(∩_{l=1}^{i} E_{j_l} | X_k = r),

because the two probabilities in the first expression for S_i^r - S_i^{r+1} are equal for all j ∈ J_i - G (since the X_j's are independent) and for j ∈ G the second probability is zero. Thus

S_i^r - S_i^{r+1} = Σ_{j∈G} Pr[X_w ≤ min_{1≤l≤i} x_w(C^{j_l}), 1 ≤ w ≤ n | X_k = r]

= Σ_{j∈G} Pr[X ≤ x(D) | X_k = r],

where D = (D_0, D_1, D_2) is the union of C^{j_1}, ..., C^{j_i} (so that j ∈ G if and only if k ∈ D_r). Since Pr[X_w = 0] = ε^{a_0} and Pr[X_w ≤ 1] = ε^{a_0} + ε^{a_1}, each term in this sum is of order ε^{a_0|D_0| + a_1|D_1| - a_r}, the factor ε^{-a_r} arising from the conditioning on X_k = r. By the definitions of e_k^{r+1,r} and f_k^{r+1,r}, the lowest-order term in the resulting polynomial expression for S_1^r - S_1^{r+1} is f_k ε^{e_k}. [For notational simplicity the superscript r+1, r which should appear on e_k, f_k, >_c, and I_b will be dropped in the remainder of the proof.]




Next we show that S_i^r - S_i^{r+1} = o(ε^{e_k}) for all i ≥ 2. Let D be the union of any i minimal cuts C^{j_1}, ..., C^{j_i} satisfying k ∈ D_r, where without loss of generality k ∈ C_r^{j_1}. Then

|D_0| ≥ |C_0^{j_1}|

and

(2) |D_0| + |D_1| ≥ |C_0^{j_1}| + |C_1^{j_1}|.

Assume that the above two inequalities simultaneously hold as equalities. Then |D_0| = |C_0^{j_1}|, which implies that D_0 = C_0^{j_1}, and |D_1| = |C_1^{j_1}|, which implies that D_1 = C_1^{j_1}. Thus D = C^{j_1} and so x(D) = x(C^{j_1}). Now as an immediate consequence of the definition of D, x(D) ≤ x(C^{j_l}), and so x(C^{j_1}) ≤ x(C^{j_l}). Furthermore, C^{j_1} ≠ C^{j_l} for l ≠ 1, so the inequality must be strict in at least one coordinate. But this contradicts the assumption that the cut C^{j_1} is minimal, and so at least one of the inequalities in (2) must be strict.

Now the lowest power in the polynomial expression for S_i^r - S_i^{r+1} is a_0|D_0| + a_1|D_1| - a_r. But

a_0|D_0| + a_1|D_1| - a_r = (a_0 - a_1)|D_0| + a_1(|D_0| + |D_1|) - a_r

> (a_0 - a_1)|C_0^{j_1}| + a_1(|C_0^{j_1}| + |C_1^{j_1}|) - a_r

= a_0|C_0^{j_1}| + a_1|C_1^{j_1}| - a_r ≥ e_k.

Thus S_i^r - S_i^{r+1} = o(ε^{e_k}) for all i ≥ 2. Thus, by equation (1),

(3) I_b(k; p(ε)) = f_k ε^{e_k} + o(ε^{e_k}).

Now assume that l >_c k. If e_l < e_k, then by equation (3)

I_b(l; p(ε)) - I_b(k; p(ε)) = f_l ε^{e_l} + o(ε^{e_l}).

Thus for ε sufficiently close to zero this expression is positive and so the two orderings of l and k are identical in this case.

If e_l = e_k and f_l > f_k, then again by equation (3)

I_b(l; p(ε)) - I_b(k; p(ε)) = (f_l - f_k) ε^{e_l} + o(ε^{e_l}).

Thus in this case, also, the two orderings are consistent for ε sufficiently small. This establishes the theorem for all the r+1, r orderings. To establish the result for any r, s ordering note that

I_b^{r,s}(k; p(ε)) = Σ_{u=s}^{r-1} I_b^{u+1,u}(k; p(ε)).

(See Proposition 3, part (i).) Combining this result with equation (3),

I_b^{r,s}(k; p(ε)) = f_k^{r,s} ε^{e_k^{r,s}} + o(ε^{e_k^{r,s}}).

The remainder of the proof is identical to the r+1, r case. [Note: For ternary systems, the only r+1, r orderings are the 2,1 ordering and the 1,0 ordering. The only other r,s ordering is the 2,0 ordering. The notation in the proof and the definition of r,s cut-importance has been kept more general so that the extensions to general n-state systems can be more readily understood.] □

EXAMPLE 2: In the following diagram, the states are shown in a lattice arrangement 
according to the less-than-or-equal-to relation. 

φ(2,2) = 2,  φ(2,1) = 2,  φ(1,2) = 2,  φ(2,0) = 2,
φ(1,1) = 1,  φ(0,2) = 1,  φ(1,0) = 1,
φ(0,1) = 0,  φ(0,0) = 0.

Consider the case where m = 2, a_0 = 2, a_1 = 1.

Min cuts: C^1 = (E, {1,2}, E) (E denotes the empty set),
          C^2 = ({1}, E, {2}).

e_1^{2,1} = 1; e_2^{2,1} = 1.  f_1^{2,1} = 1; f_2^{2,1} = 1.  Components 1, 2 are not comparable under the 2,1 ranking.

e_1^{1,0} = 0; e_2^{1,0} = +∞.  f_1^{1,0} = 1; f_2^{1,0} = 0.  1 >_c^{1,0} 2.

e_1^{2,0} = 0; e_2^{2,0} = 1.  f_1^{2,0} = 1; f_2^{2,0} = 1.  1 >_c^{2,0} 2.



h(P) = 1 - (p_{10} + p_{11})(p_{20} + p_{21}) - p_{10} + p_{10}(p_{20} + p_{21})
     = 1 - p_{11}(p_{20} + p_{21}) - p_{10}.

I_b^{2,1}(1; p(ε)) = ε^2 + ε.    I_b^{2,1}(2; p(ε)) = ε.

I_b^{1,0}(1; p(ε)) = 1 - ε - ε^2.    I_b^{1,0}(2; p(ε)) = 0.

I_b^{2,0}(1; p(ε)) = 1.    I_b^{2,0}(2; p(ε)) = ε.

Thus component 1 is more important than component 2 in an overall sense (i.e., according to the 2,0 cut-importance ranking). Moreover, the 2,1 and 1,0 rankings of the components




show that it is the state 1 to state 0 transition of the components which determines the 2,0 ranking here.
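The importance values quoted in Example 2 can be reproduced by enumerating the 3^2 component-state vectors. A minimal sketch (Python, illustrative; components and states indexed as in the example):

```python
from itertools import product

# Structure function of Example 2 (two ternary components)
PHI = {(2, 2): 2, (2, 1): 2, (1, 2): 2, (2, 0): 2,
       (1, 1): 1, (0, 2): 1, (1, 0): 1, (0, 1): 0, (0, 0): 0}

def h(P, m=2):
    # P[i][j] = Pr[component i is in state j]; h(P) = Pr[phi(X) >= m]
    return sum(P[0][x1] * P[1][x2]
               for x1, x2 in product(range(3), repeat=2)
               if PHI[(x1, x2)] >= m)

def imp(i, r, s, P):
    # I_b^{r,s}(i; P) = h(r_i, P) - h(s_i, P)
    def fixed(j):
        Q = [row[:] for row in P]
        Q[i] = [1.0 if st == j else 0.0 for st in range(3)]
        return Q
    return h(fixed(r)) - h(fixed(s))

eps = 0.01
p = [eps ** 2, eps, 1 - eps ** 2 - eps]   # p(eps) with a0 = 2, a1 = 1
P = [p[:], p[:]]
print(imp(0, 2, 1, P))   # eps^2 + eps
print(imp(1, 2, 1, P))   # eps
print(imp(0, 1, 0, P))   # 1 - eps - eps^2
print(imp(1, 1, 0, P))   # 0
```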

As was the case for binary systems, analogous results based upon minimal paths can be 
developed for ternary systems composed of very unreliable components. 

4. CONCLUSIONS 

Reliability engineers usually know or can calculate the minimal cuts of the systems with 
which they deal. However, the component reliabilities, though usually thought to be fairly 
high, are often not known with any degree of precision. The cut-importance rankings 
developed in this paper are structural rankings, i.e., depend only upon the system structure 
through the minimal cuts. Also they are defined so as to relate closely to the Birnbaum reliabil- 
ity importance measure when the component reliabilities are high. Thus they provide reliability 
analysts and engineers with ways to meaningfully compare the relative importance to the system 
reliability of the various components of the system. 

REFERENCES 

[1] Barlow, R.E., "Coherent Systems with Multi-State Components," University of California 
Operations Research Center Technical Report ORC 77-5, Berkeley, California (January 
1977). 

[2] Barlow, R.E. and F. Proschan, Statistical Theory of Reliability and Life Testing: Probability 
Models (Holt, Rinehart, and Winston, 1975). 

[3] Barlow, R.E. and F. Proschan, "Importance of System Components and Fault Tree 
Events," Stochastic Processes and Their Applications, Vol. 3, pp. 153-172 (1975). 

[4] Birnbaum, Z.W., "On the Importance of Different Components in a Multi-Component System," in Multivariate Analysis-II, P.R. Krishnaiah (ed.) (Academic Press, New York, 1969).

[5] Butler, D.A., "An Importance Ranking for System Components Based upon Cuts," Opera- 
tions Research, Vol. 25, No. 5, pp. 874-879 (1977). 

[6] Butler, D.A., "A Complete Importance Ranking for Components of Binary Coherent Sys- 
tems, with Extensions to Multi-State Systems," Technical Report No. 183, Department 
of Operations Research, Stanford University, Stanford, CA (1977). 

[7] El-Neweihi, E., F. Proschan and J. Sethuraman, "Multi-State Coherent Systems," Florida State University Statistics Report M434 (October 1977).

[8] Feller, W., An Introduction to Probability Theory and its Applications, Vol. I, 3rd ed., pp. 98- 
101 (John Wiley and Sons, New York, 1968). 

[9] Hatoyama, Y., "Fundamental Concepts for Reliability Analysis of Three-State Systems," unpublished manuscript, Department of Operations Research, Stanford University (1976).
[10] Lambert, H.E., "Measures of Importance of Events and Cut-Sets in Fault Trees," 

Lawrence Livermore Laboratory UCRL-75853 (October 1974). 
[11] Murchland, J.D., "Fundamental Concepts and Relations for Reliability Analysis of Multi- 
State Systems," Reliability and Fault Tree Analysis, Society for Industrial and Applied 
Mathematics (1975). 
[12] Postelnicu, V., "Nondichotomic Multi-Component Structures," Bulletin Mathematique de 
la Societe des Sciences Mathematiques de la Republique Socialiste de Roumanie, Vol. 
14, (62), No 2, pp. 209-217, (1970). 



A DIFFUSION MODEL FOR THE CONTROL OF 
A MULTIPURPOSE RESERVOIR SYSTEM 

Dror Zuckerman* 

Department of Operations Research 

College of Engineering 

Cornell University 

Ithaca, New York 

ABSTRACT 

This paper develops a methodology for optimizing operation of a multipur- 
pose reservoir with a finite capacity V. The input of water into the reservoir is 
a Wiener process with positive drift. There are n purposes for which water is 
demanded. Water may be released from the reservoir at any rate, and the 
release rate can be increased or decreased instantaneously with zero cost. In 
addition to the reservoir, a supplementary source of water can supply an unlim- 
ited amount of water demanded during any period of time. There is a cost of C_i dollars per unit of demand supplied by the supplementary source to the i-th purpose (i = 1, 2, ..., n). At any time, the demand rate R_i associated with the i-th purpose (i = 1, 2, ..., n) must be supplied. A controller must continually decide the amount of water to be supplied by the reservoir for each purpose, while the remaining demand will be supplied through the supplementary source with the appropriate costs. We consider the problem of specifying an output policy which minimizes the long run average cost per unit time.



1. INTRODUCTION AND FORMULATION

Complex systems of reservoirs today are used to produce supplies of water for agriculture, industry and urban use. In addition, the production of hydroelectric power is usually a major objective of water resource systems.

An excellent account of the theory of storage systems, describing results obtained up to 1964, is contained in Prabhu's paper [9]. Considerable progress has since been made in several directions, but most of the models are descriptive, rather than control-oriented. Dynamic programming models for the optimal control of multipurpose reservoir systems have been proposed by Hall, Butcher and Esogbue [4], Russell [10] and many others. Most of the models involved discrete time analysis. Meanwhile other authors, notably Bather [1], Faddy [2], [3], Haslett [5] and Pliska [8], have developed diffusion models for the control of a dam with finite reservoir capacity, where the optimality was defined in terms of a cost (or a utility) structure imposed on the operation of the system. The main purpose of this article is to provide an additional insight into the nature of the optimal controls.



*Now at The Hebrew University of Jerusalem.

This research was supported in part by the National Science Foundation under Grant MBS 73-04437.




580 D. ZUCKERMAN 



In the present study we develop a methodology for optimizing operation of a multipurpose
reservoir with a finite capacity V, described by the following model: The input of water into the
reservoir is determined by a Wiener process with positive drift μ and variance σ². There are n
purposes for which water is demanded. Water may be released from the reservoir at any rate
R (R ≥ 0). Let R_i be the demand in units of water per unit time associated with the i-th pur-
pose (i = 1, 2, ..., n). At any time the release rate may be increased or decreased with zero
cost, any such changes taking effect instantaneously. In addition to the reservoir, a supplemen-
tary source of water can supply an unlimited amount of the water demanded during any period of
time. There is a cost of C_i dollars per unit of demand provided by the supplementary source to
the i-th purpose (i = 1, 2, ..., n). We assume without loss of generality that C_1 ≥ C_2 ≥ ...
≥ C_n. At any time, the demand rate R_i associated with the i-th purpose (i = 1, 2, ..., n) must
be supplied. A controller must continually decide the amount of water to be supplied by the
reservoir for each purpose, while the remaining demand will be supplied through the supple-
mentary source at the appropriate costs. We consider the problem of specifying an output
policy which minimizes the long-run average cost per unit time. An example will be presented
to illustrate computational procedures.

2. THE MODEL

Let us denote by X(t) the input into the dam during the time interval (0, t]; as indicated
earlier, {X(t); t ≥ 0} is a Wiener process. By an appropriate choice of units, we may assume
without loss of generality that μ = 1, σ² = 2.

Note that negative values of the storage level (as in [1]) have to be taken into account.
This representation is relatively crude, but a solution to the problem of optimal control is still
useful, since control is needed only when the storage level is positive; for non-positive values
of the storage level process, the demand associated with the n purposes must be supplied
totally through the supplementary source, under any permissible output policy.

If we assume, as in Pliska's paper [8], that 0 is a reflecting boundary, then the expected
time that the dam is empty (dry) over any given period is 0, independently of the output policy
which is employed. This situation seems to be unrealistic for most reservoir models.
Furthermore, the multipurpose reservoir model considered here becomes meaning-
less, since the optimal policy in that case is to supply the demand associated with the n purposes
totally through the reservoir, and the resulting average cost associated with that policy will be
zero. In view of this, the Bather model seems to be more appropriate in our case.

It will be very helpful to restrict the state space of the storage level process to a
finite interval (in order to apply some results obtained by Mandl [6] and Pliska [7]). Therefore
we make the following modification: Assume that the storage level is bounded from below by
-1, where -1 is an elementary return boundary. That is, when the trajectory of the storage
level process reaches the boundary -1 it remains at -1 for a random amount of time which
possesses the exponential distribution with mean 1, and after the termination of the sojourn
time on the boundary -1, the process jumps to position 0 with probability 1. Clearly, the
expected transition time from -1 to 0 is the same as under the original process. Thus, since
our goal is to minimize the total long-run average cost per unit time, the above modification of
the storage level process does not affect the decision problem (it is just a mathematical tool).

Now let us consider the set of admissible output policies. In selecting the output policy
only the current state, that is, the level of water in the reservoir, is important. The particular time
is irrelevant since the input process is time homogeneous and since we are concerned with an
infinite future. Thus we consider only policies Γ such that

    Γ: [-1, V] → S,

where

    S = {(a_1, a_2, ..., a_n) | 0 ≤ a_i ≤ 1, for i = 1, 2, ..., n}.

In words, Γ(x) = (Γ_1(x), Γ_2(x), ..., Γ_n(x)) means that under an output policy Γ, when the
storage level is x, 100 Γ_i(x) percent of the demand rate associated with the i-th purpose will be
supplied by the reservoir, and 100 (1 - Γ_i(x)) percent of it will be provided by the supplemen-
tary source, for i = 1, 2, ..., n.

The set M of admissible controls consists of all piecewise continuous functions Γ(·) on
[-1, V] with range in S.

Let {Z_Γ; Γ ∈ M} be the controlled process, corresponding to the storage level in the dam
when a policy Γ ∈ M is employed.

The controlled process {Z_Γ; Γ ∈ M} is a diffusion process whose state space is the inter-
val (-1, V), with drift parameter

(1)    b_Γ(x) = 1 - Σ_{i=1}^{n} Γ_i(x) R_i

and diffusion parameter (see Mandl [6], p. 12)

(2)    a_Γ(x) = σ²/2 = 1.

We assume that V is a reflecting boundary.

The cost arising from continuous movement of the controlled process is given by a
bounded piecewise continuous function C_Γ(x) = Σ_{i=1}^{n} C_i R_i (1 - Γ_i(x)) defined on [-1, V]. If
the trajectory is in position x at time t, then there arises a cost of magnitude C_Γ(x) Δt +
o(Δt) in the time interval (t, t + Δt).

We want to find a control Γ* ∈ M which minimizes the long-run average cost per unit
time.
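To make the controlled process concrete, the following sketch simulates the storage level and the resulting average cost under a two-purpose threshold policy of the form derived in the next section. All numerical values here (the capacity, demand rates, costs, threshold, and step size) are hypothetical, chosen only for illustration.

```python
import numpy as np

# Euler-Maruyama simulation of the controlled storage process (mu = 1, sigma^2 = 2)
# under a two-purpose threshold policy; all parameter values are illustrative.
rng = np.random.default_rng(1)

V = 100.0                    # reservoir capacity (reflecting boundary)
R = np.array([0.9, 0.2])     # demand rates R_1, R_2
C = np.array([4.0, 1.0])     # supplementary-source costs, C_1 >= C_2
gamma1 = 55.0                # threshold: serve purpose 2 from the reservoir above it

def policy(z):
    # Threshold policy: Gamma(z) gives the fraction of each demand
    # supplied by the reservoir at storage level z.
    if z <= 0.0:
        return np.array([0.0, 0.0])
    if z <= gamma1:
        return np.array([1.0, 0.0])
    return np.array([1.0, 1.0])

dt, steps = 0.01, 50_000
z, cost, elapsed = 0.0, 0.0, 0.0
for _ in range(steps):
    g = policy(z)
    cost += np.sum(C * R * (1.0 - g)) * dt          # C_Gamma(z) dt
    z += (1.0 - np.sum(g * R)) * dt + np.sqrt(2.0 * dt) * rng.standard_normal()
    z = min(z, V)                                   # reflection at V
    elapsed += dt
    if z <= -1.0:                                   # elementary return boundary at -1
        s = rng.exponential(1.0)                    # exponential sojourn, mean 1
        cost += np.sum(C * R) * s                   # dry spell: all demand supplementary
        elapsed += s
        z = 0.0                                     # jump to position 0

print(f"estimated long-run average cost: {cost / elapsed:.3f}")
```

The estimate necessarily lies between 0 and Σ C_i R_i (the cost of supplying every demand from the supplementary source).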



3. THE OPTIMAL POLICY

In this section we show that the optimal output policy has the following form:

(3)    Γ(z) = (0, 0, 0, ..., 0, 0)    if z ≤ 0,
              (1, 0, 0, ..., 0, 0)    if 0 < z ≤ γ_1,
              (1, 1, 0, ..., 0, 0)    if γ_1 < z ≤ γ_2,
              (1, 1, 1, 0, ..., 0)    if γ_2 < z ≤ γ_3,
              ...
              (1, 1, 1, ..., 1, 1)    if z > γ_{n-1},

where

    0 ≤ γ_1 ≤ γ_2 ≤ ... ≤ γ_{n-1} ≤ V.

(Recall that C_1 ≥ C_2 ≥ ... ≥ C_n by assumption.)

In the special case in which C_1 = C_2 = ... = C_n, as one might anticipate, the optimal
control would specify the maximum discharge rate at all positive levels, i.e., γ_1* = γ_2* = ... =
γ_{n-1}* = 0.

First consider the subclass of policies L ⊂ M under which, at any given state, the
demand associated with the i-th purpose (i = 1, 2, ..., n) will be satisfied totally by the reser-
voir, or totally by the supplementary source. It can easily be seen that L is the collection of all
piecewise constant functions Γ(·) on [-1, V] with range in S̄ ⊂ S, where

    S̄ = {(a_1, a_2, ..., a_n) | a_i = 0 or 1, for i = 1, 2, ..., n}.

Thus under an output policy Γ ∈ L, the set (0, V] can be decomposed into a finite number of
intervals, on each of which Γ is constant. That is, for each output policy Γ ∈ L, there exists
a real sequence {γ_i}_{i=0}^{p} such that

    0 = γ_0 < γ_1 < γ_2 < ... < γ_p = V,

where Γ(·) is constant over each interval (γ_{i-1}, γ_i) for i = 1, 2, ..., p. For each policy
Γ ∈ L, we define the following sets:

(4)    A_i(Γ) = {j | 1 ≤ j ≤ n, Γ_j(x) = 1 for x ∈ (γ_{i-1}, γ_i)},    (1 ≤ i ≤ p).

The theory of optimal control in diffusion processes (Theorem 5 of Mandl [6], p. 168) implies
that under a given policy Γ ∈ M, the long-run average cost per unit time, φ_Γ, is the unique
number for which there exists a continuous function w_Γ(·) on [-1, V] such that

(5)    dw_Γ(z)/dz + b_Γ(z) w_Γ(z) + C_Γ(z) - φ_Γ = 0

holds for every z ∈ (-1, V) which is a point of continuity of Γ, and such that

(6)    w_Γ(0) = φ_Γ - Σ_{j=1}^{n} R_j C_j,

(7)    w_Γ(V) = 0.

Let

    a_i(Γ) = 1 - Σ_{j ∈ A_i(Γ)} R_j

and

    β_i(Γ) = Σ_{j ∉ A_i(Γ)} R_j C_j.

The general solution of the differential equation (5), assuming a_i(Γ) ≠ 0 for a policy Γ ∈ L, is
given by

(8)    w_Γ(z) = (φ_Γ - C_Γ(z))/b_Γ(z) + d_i e^{-b_Γ(z) z} = (φ_Γ - β_i(Γ))/a_i(Γ) + d_i e^{-a_i(Γ) z}

for z ∈ (γ_{i-1}, γ_i), (i = 1, 2, ..., p), where the d_i are arbitrary constants. The solution for the
case a_i(Γ) = 0 can be obtained by considering the limiting behaviour of (8) as a_i(Γ)
approaches 0.



Recalling that w_Γ(·) is a continuous function over [-1, V], the constants d_i must be
chosen to assure continuity of w_Γ(·) at the points γ_i (i = 1, 2, ..., p-1). Thus we obtain the
following equations:

(9)    (φ_Γ - β_i(Γ))/a_i(Γ) + d_i e^{-a_i(Γ) γ_i} = (φ_Γ - β_{i+1}(Γ))/a_{i+1}(Γ) + d_{i+1} e^{-a_{i+1}(Γ) γ_i}

for i = 1, 2, ..., p - 1. The optimal control Γ* will be determined with the aid of the follow-
ing theorem.

THEOREM 1: A control Γ* is optimal if and only if

(10)    Γ_i*(x) = 1    if C_i + w_{Γ*}(x) ≥ 0,
                = 0    if C_i + w_{Γ*}(x) < 0,

that is, Γ_i*(x) = I(C_i + w_{Γ*}(x) ≥ 0), for i = 1, 2, ..., n and for every x which is a continuity
point of Γ*, where w_{Γ*} is the solution of the differential equation (5) when policy Γ* is
employed, and I(E) is the indicator function of the event E.

PROOF: Let

    θ_{Γψ}(x) = b_ψ(x) w_Γ(x) + C_ψ(x),    for Γ ∈ M and ψ ∈ M.

According to Theorem 6 of Mandl ([6], p. 168), Γ* is optimal if and only if

(11)    θ_{Γ*Γ*}(x) = min_{ψ∈M} {θ_{Γ*ψ}(x)}

for every x ∈ (0, V] which is a continuity point of Γ*. (Note that a_ψ(x) = 1 for every position
x ∈ (0, V], under any output policy ψ ∈ M.) For a given policy ψ ∈ M we have

(12)    θ_{Γ*ψ}(x) = (1 - Σ_{i=1}^{n} R_i ψ_i(x)) w_{Γ*}(x) + Σ_{i=1}^{n} (1 - ψ_i(x)) C_i R_i
                  = w_{Γ*}(x) + Σ_{i=1}^{n} C_i R_i - Σ_{i=1}^{n} R_i ψ_i(x) [w_{Γ*}(x) + C_i].

It then clearly follows that θ_{Γ*Γ*}(x) = min_{ψ∈M} {θ_{Γ*ψ}(x)} for every x ∈ (0, V] which is a con-
tinuity point of Γ* if and only if (10) holds for i = 1, 2, ..., n. This concludes the proof. □

Generally, there is no guarantee that an admissible optimal control will exist. However,
in our case it follows from Theorem 1 that if an optimal output policy Γ* ∈ M exists, then
Γ* ∈ L. But the existence of an optimal control in the subclass L of policies follows directly
from Theorem 4.1 of Pliska [7]. Thus there exists an optimal control in M. We proceed with
the following proposition.

PROPOSITION 1: Let Γ* be the optimal output policy; then w_{Γ*} is nondecreasing over
the interval [0, V].

PROOF: We will summarize briefly the main steps of the proof. Clearly
φ_{Γ*} - Σ_{i=1}^{n} R_i C_i ≤ 0. Using equations (6) and (7), we obtain the following inequality:

(13)    w_{Γ*}(0) ≤ w_{Γ*}(V) = 0.



From (13) it can be seen by elementary analysis that if w_{Γ*} is not nondecreasing, then there
exist two points x and y in the open interval (0, V), which are continuity points of Γ*, such
that

(14)    w_{Γ*}(x) = w_{Γ*}(y)

and

(15a)    dw_{Γ*}(z)/dz |_{z=x} < 0,

(15b)    dw_{Γ*}(z)/dz |_{z=y} > 0.

Using equations (10) and (14) we obtain that Γ*(x) = Γ*(y). Hence,

(16)    b_{Γ*}(x) = b_{Γ*}(y)

and

(17)    C_{Γ*}(x) = C_{Γ*}(y).

From (5) we obtain

(18)    dw_{Γ*}(z)/dz |_{z=x} + b_{Γ*}(x) w_{Γ*}(x) - φ_{Γ*} + C_{Γ*}(x)
      = dw_{Γ*}(z)/dz |_{z=y} + b_{Γ*}(y) w_{Γ*}(y) - φ_{Γ*} + C_{Γ*}(y);

now substituting equations (14), (16) and (17) into (18), we have

    dw_{Γ*}(z)/dz |_{z=x} = dw_{Γ*}(z)/dz |_{z=y},

which is a contradiction to (15a) and (15b). Hence w_{Γ*} is nondecreasing. □


Recalling that C_1 ≥ C_2 ≥ ... ≥ C_n, and by using Proposition 1 and Theorem 1, it fol-
lows that if Γ_j*(x) = 1 for a given j, then Γ_i*(z) = 1 for z such that x ≤ z ≤ V and for each i
such that 1 ≤ i ≤ j. Now in order to establish that the optimal output policy has the form
given in (3), we still have to show that Γ_1*(x) = 1 for every positive x. But this is a direct
consequence of the following proposition.

PROPOSITION 2: The optimal output policy Γ* satisfies the following condition:

    w_{Γ*}(0) + C_1 ≥ 0.

PROOF: The proof will be done by contradiction. Suppose that

(19)    w_{Γ*}(0) + C_1 = -δ < 0.

Since w_{Γ*} is continuous on [0, V], it follows that there exists ε(δ) > 0 such that w_{Γ*}(y)
+ C_1 < 0 for 0 < y ≤ ε(δ).

Now recalling that w_{Γ*}(·) satisfies (10), it follows that Γ_i*(y) = 0 for 0 < y ≤ ε(δ) and
for i = 1, 2, ..., n. Using equations (6) and (8) we have

    w_{Γ*}(y) = φ_{Γ*} - Σ_{i=1}^{n} C_i R_i    for 0 ≤ y ≤ ε(δ).

Since w_{Γ*}(0) = w_{Γ*}(ε(δ)), we can repeat the same argument over the intervals [ε(δ),
2ε(δ)], ..., and therefore Γ_i*(y) = 0 for i = 1, 2, ..., n and for every position y ∈ [0, V].
Thus Γ* is the trivial policy that keeps the output rate constantly at zero. But (7) must hold, so
φ_{Γ*} = Σ_{i=1}^{n} C_i R_i and w_{Γ*}(0) = 0. But C_1 > 0, so

(20)    w_{Γ*}(0) + C_1 = φ_{Γ*} - Σ_{i=1}^{n} C_i R_i + C_1 > 0,

which is a contradiction to (19). Therefore

    w_{Γ*}(0) + C_1 ≥ 0,

as required. □

This concludes the proof that the optimal output policy is of the form given in (3), as
desired.

In the following example, we will illustrate how to determine the optimal control values
γ_1*, γ_2*, ..., γ_{n-1}*.

EXAMPLE: Let us consider the following case: We have a finite dam with capacity
V = 100. There are two types of demand for water, where

    R_1 = 0.9,    R_2 = 0.2,
    C_1 = KC (K ≥ 1),    C_2 = C.

First note that for a given policy Γ ∈ L which has the form of (3), w_Γ is given by (see (8))

(21)    w_Γ(z) = 10 φ_Γ - 2C + 10 d_1 e^{-0.1z}    for 0 ≤ z ≤ γ_1,
              = -10 φ_Γ - 10 d_2 e^{0.1z}          for γ_1 < z ≤ V = 100.

Using (6), (7) and (9) we obtain the following equations:

    1.  10 φ_Γ - 2C + 10 d_1 = φ_Γ - C(0.9K + 0.2),

    2.  -10 φ_Γ - 10 d_2 e^{10} = 0,

    3.  10 φ_Γ - 2C + 10 d_1 e^{-0.1γ_1} = -10 φ_Γ - 10 d_2 e^{0.1γ_1}.

From the above 3 equations we obtain that the long-run average cost associated with an
output policy Γ of the form given in (3) will be

(22)    φ_Γ = C [2 - 0.9 δ^{-1} (2 - K)] / [20 - 10 δ e^{-10} - 9 δ^{-1}],

where δ = e^{0.1γ_1}. From the cost function introduced above it can be seen that only the ratio of
costs K = C_1/C_2 is important in order to determine the optimal critical value γ_1*.

Suppose that K = 4; then using (22) one can easily obtain that γ_1* = 55 minimizes φ_Γ
subject to the following constraint:

    0 ≤ γ_1 ≤ V = 100,

and the resulting minimum cost is φ_{Γ*} ≈ 0.1 C.
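The minimization in (22) is easy to carry out numerically. The sketch below evaluates the average cost over a grid of thresholds, with the cost expressed in units of C and K = 4 as in the text:

```python
import numpy as np

# Numerical check of the example: minimize the average cost (22) over the
# threshold gamma_1, with K = 4 and cost expressed in units of C.
K = 4.0
t = np.linspace(0.0, 100.0, 100001)     # gamma_1 grid on [0, V]
delta = np.exp(0.1 * t)                 # delta = e^{0.1 gamma_1}
phi = (2.0 - 0.9 * (2.0 - K) / delta) / (20.0 - 10.0 * delta * np.exp(-10.0) - 9.0 / delta)

i = np.argmin(phi)
print(f"gamma_1* ~ {t[i]:.1f}, phi* ~ {phi[i]:.4f} C")   # ~55, ~0.1 C
```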




ACKNOWLEDGMENT 

The author acknowledges helpful and illuminating discussions with Professor N.U. Prabhu.

REFERENCES 



[1] Bather, J., "A Diffusion Model for the Control of a Dam," Journal of Applied Probability,
    5, 55-71 (1968).
[2] Faddy, M.J., "Optimal Control of Finite Dams: Continuous Output Procedure," Advances
    in Applied Probability, 6, 689-710 (1974).
[3] Faddy, M.J., "Optimal Control of Finite Dams: Discrete (2-Stage) Output Procedure,"
    Journal of Applied Probability, 11, 111-121 (1974).
[4] Hall, W.A., W.S. Butcher and A.M.O. Esogbue, "Optimization of the Operation of a
    Multi-Purpose Reservoir by Dynamic Programming," Water Resources Research, 4, 471-
    477 (1968).
[5] Haslett, J., "The Control of a Multi-Purpose Reservoir," Advances in Applied Probability,
    8, 592-609 (1976).
[6] Mandl, P., Analytical Treatment of One-Dimensional Markov Processes (Springer-Verlag,
    New York, 1968).
[7] Pliska, S.R., "Single Person Controlled Diffusions with Discounted Costs," Journal of
    Optimization Theory and Applications, 12, 248-255 (1973).
[8] Pliska, S.R., "A Diffusion Process Model for the Optimal Operation of a Reservoir System,"
    Journal of Applied Probability, 12, 859-863 (1975).
[9] Prabhu, N.U., "Time-Dependent Results in Storage Theory," Journal of Applied Probability,
    1, 1-46 (1964).
[10] Russell, C.B., "An Optimal Policy for Operating a Multipurpose Reservoir," Operations
    Research, 20, 1181-1189 (1972).






COMPUTATION TECHNIQUES FOR LARGE SCALE UNDISCOUNTED
MARKOV DECISION PROCESSES

Thom J. Hodgson and Gary J. Koehler 

University of Florida 
Gainesville, Florida 

ABSTRACT 

In this paper we consider computation techniques associated with the optim-
ization of large scale Markov decision processes. Markov decision processes
and the successive approximation procedure of White are described. Then a
procedure for scaling continuous time and renewal processes so that they are
amenable to the White procedure is discussed. The effect of the scale factor
value on the convergence rate of the procedure and insights into proper scale
factor selection are given.



INTRODUCTION

One of the most powerful modeling tools for the analysis of controlled probabilistic sys-
tems is Markov decision processes. If the system can be structured as a Markov process and
the control decisions for the system can be defined in terms of the relevant system costs and
operational characteristics (transition probabilities), then there exists a wealth of theory that can
be used to find the best (least cost, most profitable) set of decisions for operating the system.
As with many modeling techniques, real probabilistic systems, when modeled as Markov
processes, tend to have large numbers of system states. The result is that for many interesting
and important systems, it is necessary to consider the computational aspects associated with per-
forming policy optimization.

Many types of nondiscounted Markov decision processes can be transformed to a discrete
time problem. Such a procedure was explicitly used by Schweitzer [17] for Markov renewal
programs and involves choosing a parameter, b, for the transformation. As noted by
Schweitzer, the value of b influences the asymptotic convergence rate when White's iterative
procedure [22] is used to solve the transformed Markov decision process. We present theoreti-
cal insights into the determination of a b which yields the fastest asymptotic convergence. In
practice, one cannot easily find this optimal b, so we also present heuristic rules for choosing b.
Computational results appear quite promising.

BACKGROUND

Consider a finite state, discrete time, completely ergodic Markov process which is con-
trolled by a decision maker. For each of the N states (i), at each transition of the process, the
decision maker chooses an action k = 1, ..., K_i. This action results in transition probabilities
p_ij^k, j = 1, ..., N, and a reward (cost) q_i^k. p_ij^k is defined as the probability that the process, now in
state i and under policy k, will move to state j over the next transition. q_i^k is defined as the
expected reward (cost) over the next transition. The problem is to find the gain optimal action
for each state.

Howard [5] showed that for a given policy set, the simultaneous set of linear equations

(1)    v_i + g = q_i^k + Σ_{j=1}^{N} p_ij^k v_j,    i = 1, ..., N,
       v_N = 0,

could be solved to compute the gain g of the process. The v_i's are the relative rewards (costs)
of starting the process in state i. Howard showed that the optimal gain could be obtained using
a simple policy iteration algorithm.
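For a fixed policy, equations (1) are a linear system in v_1, ..., v_{N-1} and g, so the gain of a given policy can be found with a single linear solve. A minimal sketch, using a hypothetical two-state, single-policy chain (transition matrix and rewards invented for illustration):

```python
import numpy as np

# Solve Howard's equations (1) for a fixed policy of a hypothetical
# two-state chain: v_i + g = q_i + sum_j p_ij v_j, with v_N = 0.
P = np.array([[0.5, 0.5],
              [0.3, 0.7]])
q = np.array([1.0, 2.0])
N = len(q)

# Unknowns are (v_1, ..., v_{N-1}, g). With v_N = 0 the system reads
# (I - P) v + g * 1 = q, restricted to the first N-1 components of v.
A = np.zeros((N, N))
A[:, :N - 1] = (np.eye(N) - P)[:, :N - 1]
A[:, N - 1] = 1.0                 # coefficient of g in every equation
sol = np.linalg.solve(A, q)
v, g = sol[:N - 1], sol[N - 1]
print(v, g)                       # v_1 = -1.25, gain g = 1.625
```

These are exactly the v and g values that policy iteration would use in its improvement step.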

Consider a finite state, continuous time, completely ergodic Markov decision process. For
each of the N states (i), at each transition, the decision maker chooses an action k = 1, ..., K_i.
This action results in a transition rate a_ij^k and a reward (cost) rate q_i^k. a_ij^k is defined as follows:
In an increment of time dt, the process, now in state i and under policy k, will move to state j
with probability a_ij^k dt (i ≠ j). q_i^k is the expected reward (cost) rate incurred over a residence
in state i using action k.

Howard [5] showed that for a given policy set, the set of equations

(2)    g = q_i^k + Σ_{j=1}^{N} a_ij^k v_j,    i = 1, ..., N,
       v_N = 0,

could be solved to compute the gain g, and a policy iteration algorithm could be used to com-
pute the optimal gain. Note that

    a_ii^k = - Σ_{j≠i} a_ij^k,    i = 1, ..., N.

Finally, consider a finite state, completely ergodic semi-Markov decision process. The
underlying Markov process has transition probabilities p_ij^k. The holding (transition) time (m) in
going from state i to j is described by the density function h_ij^k(m), 0 < m < ∞. The expected
holding time, given the system starts in state i, is

    τ_i^k = Σ_j p_ij^k ∫_0^∞ m h_ij^k(m) dm > 0.

Jewell [6] showed that for a given policy set, the set of equations

(3)    v_i + τ_i^k g = q_i^k + Σ_{j=1}^{N} p_ij^k v_j,    i = 1, ..., N,
       v_N = 0,

could be solved to compute the gain g, and a policy iteration algorithm could be used to com-
pute the optimal gain.

WHITE'S METHOD AND PROBLEM TRANSFORMATIONS

The bulk of the computational effort in policy iteration lies in solving (recursively) the sets
of equations (1), (2), (3). For large processes, techniques such as Gaussian elimination
quickly become untenable. White [22] proposed a successive approximation approach for the
undiscounted, discrete time, Markov decision process¹. Odoni [13] added bounds for g which
are useful in termination decisions. The White-Odoni technique can be summarized as follows:

Assume we have computed sets of values V_i(n-1), v_i(n-1), i = 1, ..., N, and a
quantity g_{n-1}. We then compute a new set

(4)    V_i(n) = max_k [ q_i^k + Σ_{j=1}^{N} p_ij^k v_j(n-1) ],    i = 1, ..., N,
       g_n = V_M(n),
       v_i(n) = V_i(n) - g_n,
       L''(n) = max_i [V_i(n) - v_i(n-1)],
       L'(n) = min_i [V_i(n) - v_i(n-1)],

where M is a state of the process such that, for all sets of policies and some integer u > 0, the
probability of reaching state M in u transitions, starting in any state i, is nonzero.
White showed that the repeated application of equations (4) will converge¹ to a solution for
equations (1). Odoni showed that

    L''(n) ≥ L''(n+1) ≥ g ≥ L'(n+1) ≥ L'(n).

In practice, White's algorithm has proven to be very effective for large scale systems. It is
flexible, self-correcting, and lends itself to the exploitation of any supersparsity [7].
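The White-Odoni recursion can be sketched in a few lines. The two-state, two-action problem below is invented for illustration; the reference state M and the stopping tolerance are arbitrary choices.

```python
import numpy as np

# White's successive approximation with Odoni's bounds on a hypothetical
# 2-state problem: two actions in state 0, one action in state 1.
# P[i][k] is the transition row and q[i][k] the one-step reward.
P = [[np.array([0.5, 0.5]), np.array([0.9, 0.1])],
     [np.array([0.3, 0.7])]]
q = [[1.0, 0.5],
     [2.0]]
N, M = 2, 1                      # M: reference state, g_n = V_M(n)

v = np.zeros(N)
for _ in range(1000):
    V = np.array([max(q[i][k] + P[i][k] @ v for k in range(len(q[i])))
                  for i in range(N)])
    upper = np.max(V - v)        # L''(n) >= g   (Odoni's bounds)
    lower = np.min(V - v)        # L'(n)  <= g
    v = V - V[M]                 # v_i(n) = V_i(n) - g_n
    if upper - lower < 1e-10:
        break

print(f"gain ~ {0.5 * (upper + lower):.6f}")
```

For this example the optimal action in state 0 is the first one, and the bounds converge quickly to the gain 1.625.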

While straightforward application of White's approach does not, in general, work for con-
tinuous time and semi-Markov processes, these processes can be transformed to a form com-
patible with White's approach. Consider equations (2) with v_i added to both sides of the equa-
tion:

(5)    v_i + g = q_i^k + Σ_{j≠i} a_ij^k v_j + (1 + a_ii^k) v_i,    i = 1, ..., N,
       v_N = 0.

Noting the definition of a_ii^k, then if

(6)    0 > a_ii^k = - Σ_{j≠i} a_ij^k > -1,    i = 1, ..., N,

equation (5) is of the same form as equation (1). Substituting 1 + a_ii^k for a_ii^k in the rate matrix,
the new matrix [a_ij^k] has the properties of a stochastic matrix.

If (6) holds, it follows that White's method can be used to solve the continuous time
Markov decision process. The following procedure can be used to convert a continuous time
problem to satisfy (6):

1. Let a_max = max_{i=1,...,N; k=1,...,K_i} |a_ii^k|.

¹The assumptions used by White can be relaxed. Schweitzer [19] proved convergence for the general single-chain aperiodic
process, while Su and Deininger [20] extended this to the periodic case. Such conditions are hard to test in practice.
Recently Platzman [14] has given a weaker condition that can be readily tested. Finally, Morton and Wecker [11] have
generalized most of the above plus have added some new dimensions to the algorithm.






2. Divide all a_ij^k and q_i^k, i, j = 1, ..., N, k = 1, ..., K_i, by b > a_max. Condition (6) is
now satisfied.

3. Using the new a_ij^k and q_i^k, solve the problem using White's method.

4. To express the results in terms of the original process, multiply the gain g by b. The
optimal policy and relative rewards (costs), v_i, obtained are valid for the original process.

Note that the scaling really amounts to changing the time frame of the problem.
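Steps 1-4 above can be sketched as follows, on a hypothetical two-state continuous time process with a single policy (rate matrix and reward rates invented for illustration). The scaled discrete problem is solved here by simple relative value iteration standing in for White's method.

```python
import numpy as np

# Steps 1-4 of the scaling procedure on a hypothetical two-state,
# single-policy continuous time process.
A = np.array([[-2.0, 2.0],
              [1.0, -1.0]])      # rate matrix, a_ii = -sum_{j != i} a_ij
qrate = np.array([3.0, 0.0])     # reward rates

# Step 1: a_max = max |a_ii|;  Step 2: divide rates and rewards by b > a_max.
a_max = np.max(np.abs(np.diag(A)))
b = 1.25 * a_max                 # any b > a_max works
P = np.eye(2) + A / b            # stochastic matrix: condition (6) holds
qd = qrate / b

# Step 3: solve the scaled discrete problem (relative value iteration).
v = np.zeros(2)
for _ in range(500):
    V = qd + P @ v
    g_scaled = V[1]
    v = V - g_scaled

# Step 4: multiply the gain by b to recover the continuous time gain.
g = b * g_scaled
print(g)    # the stationary distribution (1/3, 2/3) gives true gain 1.0
```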
Consider the reorganization of equations (3). 



Consider the reorganization of equations (3):

(7)    g = q_i^k / τ_i^k + Σ_{j≠i} (p_ij^k / τ_i^k) v_j + ((p_ii^k - 1)/τ_i^k) v_i,    i = 1, ..., N.

Letting

    a_ij^k = p_ij^k / τ_i^k (j ≠ i),    a_ii^k = (p_ii^k - 1)/τ_i^k,    and    q_i^k := q_i^k / τ_i^k,

it is readily seen that equations (7) are of the same form as equations (2). As a consequence,
the transformation can also be applied to semi-Markov decision processes (the transformation is
equivalent to Schweitzer's [17]). Note that a Markov process is itself a degenerate semi-
Markov process subject to transformation. One would consider such a transformation if the
convergence properties of White's method could be improved. We now address this issue.

CONVERGENCE FACILITATION

There are several procedures that have been used to accelerate convergence in solving
discounted Markov decision processes. By and large, though, these have not been examined
extensively in the non-discounted Markov decision process context. Briefly, the acceleration
techniques include problem transformation [17], cheap iterations [10, 23], suboptimal activity
elimination [1, 2, 3, 8, 9, 15, 16, 21] and extrapolation procedures [23]. We will limit our dis-
cussion here to problem transformation.

In solving (generalized) discounted Markov decision processes, it is well known that the
largest spectral radius of the transition matrices (i.e., the process spectral radius) governs the
asymptotic convergence rate. Porteus [15], Totten [21] and others have devised problem
transformations to reduce the process spectral radius. Morton and Wecker [11] have shown
that asymptotic relative value and policy convergence are at least of order (αλ)^n, where λ is
greater than the subdominant eigenvalue² and α is the discount factor. A reasonable
question to ask is whether the choice of b in Step 2 can be made to reduce the modulus of the
subdominant eigenvalue of the transition matrix of the optimal policy.

²The largest eigenvalue is always 1.0. The subdominant eigenvalue is the remaining eigenvalue having the largest
modulus.






The transition matrix for policy δ resulting from the scaling procedure is

    I + (1/b) A_δ,    where    A_δ = P_δ - I.

Let λ and x be an eigenvalue and associated eigenvector, respectively, of the starting transition
matrix I + (1/a_max) A_δ. Then

(8)    (a_max λ + b - a_max) / b

is an eigenvalue of I + (1/b) A_δ, with x its associated eigenvector. Now clearly

    re[(a_max λ + b - a_max)/b] ≥ re λ,

where re λ is the real part of λ, with -1 < re λ < 1 and b > a_max ≥ 0. However, it may not be
true that

(9)    |(a_max λ + b - a_max)/b| ≤ |λ|.

Suppose δ indexes an optimal policy and λ is a subdominant eigenvalue associated with this pol-
icy. Expanding the square of the modulus of both sides of (9) with λ = λ_1 + λ_2 i gives that a
reduction in the modulus of λ requires

    (1 - λ_1) [λ_1 + (b - a_max)/(a_max + b)] < λ_2².

If λ_2 = 0, then either λ_1 = 1 and no reduction can be made, or λ_1 < (a_max - b)/(a_max + b) and
λ_1 is necessarily negative. In this case, it would appear that any b > a_max will yield a resultant
benefit in asymptotic convergence. However, this is not necessarily true, since we may "bump"
into another eigenvalue. That is, increasing b to decrease the absolute value of the dominant
(negative) eigenvalue will eventually result in some other (positive) eigenvalue increasing until
it becomes the new subdominant eigenvalue. At that point further increases in b will not im-
prove the convergence rate.



As an example, consider the Markov process whose transition matrix is given as follows:

    .31  .13  .21  .05  .10  .20
    .15  .12  .16  .20  .12  .25
    .02  .01  .01  .01  .93  .02
    .12  .28  .09  .16  .04  .31
    .00  .01  .85  .00  .09  .05
    .11  .30  .10  .15  .14  .20

The eigenvalues are 1.0, -.8421, .6945, .2079, -.085 + .0116i, and -.085 - .0116i. It
would appear that problem transformation should be of value in speeding convergence, since
the subdominant eigenvalue is negative. From the preceding development, it would be
expected that the convergence rate of the process would be maximized at the value of b which
results in the largest negative eigenvalue being equal to the largest positive eigenvalue. Apply-
ing equation (8), to equate the two eigenvalues of the transformed matrix, we get

    (a_max/b)(.8421) - (b - a_max)/b = (a_max/b)(.6945) + (b - a_max)/b.

Solving, we get b = 1.063. In other words, transforming the process using b = 1.063 should
achieve the "best" asymptotic convergence for the process. As a test, White's algorithm was
run using costs of

    q = (1.14, 2.27, 5.06, 2.97, 3.96, 4.90)

(only one policy per state). The problem was declared "solved" when L''(n) - L'(n) ≤ 10^{-4}.
Runs were made for various values of b (see Figure 1). The actual minimum number of itera-
tions (30) occurred for a value of b = 1.09, whereas the number of iterations for b = 1.063
was slightly higher (31). The inaccuracy in prediction is expected, since the method of predic-
tion considers only main effects and ignores the contribution of the smaller eigenvalues.
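The reported value of b can be recovered directly from the two stated eigenvalues, taking a_max = .99 for the example matrix (the largest |p_ii - 1|, attained at state 3); a small arithmetic sketch:

```python
# Recover the scale factor b that equates the moduli of the transformed
# largest-negative and largest-positive eigenvalues of the example matrix.
a_max = 0.99                       # max |p_ii - 1| (state 3 has p_33 = .01)
lam_neg, lam_pos = -0.8421, 0.6945

# (a_max/b)(.8421) - (b - a_max)/b = (a_max/b)(.6945) + (b - a_max)/b
# => 2b = 2*a_max - a_max*(lam_neg + lam_pos)
b = a_max * (2.0 - (lam_neg + lam_pos)) / 2.0
print(round(b, 3))   # 1.063
```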



FIGURE 1. Number of iterations to convergence as a function of b (b from 1.0 to 1.2, iterations from 30 to 60).

As one might expect, the straightforward application of the above observations is not
practical, since the determination of eigenvalues for large processes is itself difficult. However,
in practice it is usually intuitively obvious to the analyst that a process may possess strong cyclic
tendencies, indicating that some eigenvalue has a large negative real component. If the cyclic
tendency is strong enough, this eigenvalue will be the subdominant eigenvalue, and the above
development suggests that some b > a_max may decrease the modulus of the resulting subdominant
eigenvalue and so improve the asymptotic convergence rate. In any event, applying White's
method using several values of b marginally larger than a_max, and noting the convergence rate
of the process for the various values of b, can many times be of value.

In testing the above we noted that if b was made successively slightly larger than a_max,
either the convergence improved dramatically or the convergence slightly deteriorated. To
further test this observation, we randomly generated Markov decision problems with the
number of states varying from 3 to 20. Within each state ten different actions were available.
White's method was used to solve each using b values of

    b_0 = a_max + 10^{-4},
    b_1 = 1.05 b_0,
    b_2 = 1.10 b_0,
    b_3 = 1.15 b_0.

Again, problems were declared "solved" at iteration n when L''(n) - L'(n) < 10^{-4}. If a prob-
lem was solved in fewer iterations for some b_i than for b_j with i > j, then the problem transforma-
tion was declared beneficial. Otherwise the transformation was classified as non-beneficial.
Clearly a problem could be mislabeled as non-beneficial using the grid given above but may in
fact be beneficial for some b > a_max. The opposite is not the case.

Table 1 gives the total number of iterations to solve the non-beneficial and beneficial
problem cases. N_n and N_b stand for the number of problems labelled non-beneficial and
beneficial, respectively. If we can assume that the average performance of the set of randomly
generated problems used in this study is representative of the performance of the set of real
world problems, then the following observations can be made. First, problems whose conver-
gence can be improved by increases in b above a_max are those problems that are hard to
solve anyway (see Table 1, 19.5 versus 35.4 iterations). Second, when a problem does not
show convergence improvement when b is increased above a_max, the deterioration in conver-
gence speed is not dramatic (see Table 1, 19.5 versus 22.8 iterations for a 15% increase in b
above a_max). Finally, convergence improvements, when they occur, are rather dramatic (see
Table 1, 35.4 to 18.5 iterations for a 15% change in b above a_max). These observations suggest
that use of problem transformation can be of significant value in speeding convergence.

TABLE 1. Summary of Iteration Counts

                                       b_0      b_1      b_2      b_3
  Total iteration counts:
    Non-beneficial (N_n = 46)          896      942      997     1047
    Beneficial     (N_b = 67)         2372     1520     1310     1241
  Average per problem:
    Non-beneficial                    19.5     20.5     21.7     22.8
    Beneficial                        35.4     22.7     19.6     18.5


BIBLIOGRAPHY 



[1] Hastings, N.A.J., "A Test for Non-Optimal Actions in Undiscounted Finite Markov Deci-
sion Chains," Management Science, 23, No. 1, pp. 87-92 (1976). 



594 T.J. HODGSON & G.J. KOEHLER



[2] Hastings, N.A.J. and J.M.C. Mello, "Erratum: Tests for Suboptimal Actions in Discounted

Markov Programming," Management Science, 20, No. 17, p. 1143 (1974). 
[3] Hastings, N.A.J. and J.M.C. Mello, "Tests for Suboptimal Actions in Discounted Markov

Programming," Management Science, 19, No. 9, pp. 1019-1022 (1973).
[4] Hordijk, A. and H. Tijms, "The Method of Successive Approximations and Markovian 

Decision Problems," Operations Research, 22, pp. 519-521 (1974). 
[5] Howard, R.A., Dynamic Programming and Markov Processes (MIT Press and Wiley, New 

York, 1960). 
[6] Jewell, W.S., "Markov Renewal Programming: I and II," Operations Research, 11,

pp. 938-971 (Nov.-Dec., 1963).
[7] Kalan, J.E., "Aspects of Large-Scale, In-Core Linear Programming," Proceedings of ACM 

Annual Conference, Chicago, Illinois (August 3-5, 1971). 
[8] MacQueen, J., "A Modified Dynamic Programming Method for Markovian Decision Prob- 
lems," Journal of Mathematical Analysis and Applications, 14, pp. 38-43 (1966).
[9] MacQueen, J.B., "A Test for Suboptimal Actions in Markovian Decision Problems," 
Operations Research, 15, pp. 559-561 (1967). 

[10] Morton, T.E., "On the Asymptotic Convergence Rate of Cost Differences for Markovian 
Decision Processes," Operations Research, 19, pp. 244-248 (1971).

[11] Morton, T.E. and W.E. Wecker, "Discounting, Ergodicity, and Convergence for Markov 
Decision Processes," Management Science, 23, pp. 890-900 (1977). 

[12] Nering, E.D., Linear Algebra and Matrix Theory, (2nd. Ed., John Wiley and Sons, New 
York, 1970). 

[13] Odoni, A.R., "On Finding the Maximal Gain for Markov Decision Processes," Operations 
Research, 17, pp. 857-860 (1969).

[14] Platzman, L., "Improved Conditions for Convergence in Undiscounted Markov Renewal 
Programming," Operations Research, 25, No. 3, pp. 529-533 (1977). 

[15] Porteus, E.L., "Bounds and Transformations for Discounted Finite Markov Decision 
Chains," Operations Research, 23, No. 4, pp. 761-784 (1975). 

[16] Porteus, E.L., "Some Bounds for Discounted Sequential Decision Processes," Management 
Science, 18, No. 1, pp. 7-11 (1971). 

[17] Schweitzer, P.J., "Iterative Solution of the Functional Equations of Undiscounted Markov 
Renewal Programming," Journal of Mathematical Analysis and Applications, 34, pp. 
495-501 (1971). 

[18] Schweitzer, P. J., "Multiple Policy Improvements in Undiscounted Markov Renewal Pro- 
grams," Operations Research, 19, pp. 784-793 (May-June, 1971).

[19] Schweitzer, P.J., "Perturbation Theory and Markovian Decision Processes," MIT Opera- 
tions Research Technical Report, 75, (June 1965). 

[20] Su, S.Y. and R.A. Deininger, "Generalization of White's Method of Successive Approxi- 
mations to Periodic Markovian Decision Processes," Operations Research, 20, No. 2, 
pp. 318-326 (1972). 

[21] Totten, J.C., "Computational Methods for Finite State Finite Valued Markovian Decision 
Problems," Operations Research Center, University of California, Berkeley, ORC-71 
(1971). 

[22] White, D.J., "Dynamic Programming, Markov Chains, and the Method of Successive 
Approximations," Journal of Mathematical Analysis and Applications, 6, pp. 373-376 
(1963). 

[23] Zaldivar, M. and T.J. Hodgson, "Rapid Convergence Techniques for Markov Decision 
Processes," Decision Sciences, 6, pp. 14-24 (1975). 



AN ALGORITHM (GIPC2) FOR 

SOLVING INTEGER PROGRAMMING PROBLEMS WITH 

SEPARABLE NONLINEAR OBJECTIVE FUNCTIONS 

Claude Dennis Pegden 

The Pennsylvania State University 
University Park, Pennsylvania 

Clifford C. Petersen 

Purdue University 
W. Lafayette, Indiana 

ABSTRACT 

This paper presents an algorithm for solving the integer programming prob- 
lem possessing a separable nonlinear objective function subject to linear con- 
straints. The method is based on a generalization of the Balas implicit 
enumeration scheme. Computational experience is given for a set of seventeen 
linear and seventeen nonlinear test problems. The results indicate that the al- 
gorithm can solve the nonlinear integer programming problem in roughly the 
equivalent time required to solve the linear integer programming problem of 
similar size with existing algorithms. Although the algorithm is specifically 
designed to solve the nonlinear problem, the results indicate that the algorithm 
compares favorably with the Branch and Bound algorithm in the solution of 
linear integer programming problems. 

INTRODUCTION 

This paper presents an algorithm for solving the following nonlinear pure integer pro-
gramming problem:

                   NNS              NS
(1)   Max g(x) =    Σ   f_i(x_i) +   Σ     c_i x_i
                   i=1            i=NNS+1

      s.t.  Ax ≤≥ b

            x ∈ I+

where: c, A, and b denote the usual constant arrays

       I+ denotes the set of all nonnegative integers

       ≤≥ denotes constraints of the less-than-or-equal type and greater-than-or-equal type

       f_i(x_i) is a single-variable nonlinear function with f_i(0) = 0

       the region defined by Ax ≤≥ b, x ∈ I+ is bounded and nonempty

       NNS denotes the number of nonlinear stages

       NS denotes the total number of stages

There are several transformations which are useful to convert problems to the required
form. If the problem contains k equality constraints of the form a_i x = b_i, we can replace this
set by a set of k + 1 inequalities of the form a_i x ≤ b_i for i = 1, ..., k and

595 



596 CD. PEGDEN & C.C. PETERSEN 



       k            k
       Σ  a_i x  ≥  Σ  b_i.
      i=1          i=1

If the problem contains one or more nonlinear functions f_j(x_j) such that f_j(0) ≠ 0, we can
replace each by a new function f_j'(x_j) = f_j(x_j) - f_j(0). If the nonlinear portion of the
objective function cannot be separated into functions of a single variable, but the nonseparable
portion can be separated into k functions of linear integer combinations of the variables, we
can convert the problem to the required form by replacing each of the k linear combinations
in the objective function by a dummy variable d_k. The dummy variables are forced to assume
the appropriate values by appending, for each k, the constraint that d_k equals the k-th linear
combination. To illustrate, consider the following example:

(2)   Max  (x_1 + 3x_2)^2 - 9x_2^2

      s.t. 2x_1 + x_2 ≤ 5

      x_1 and x_2 ∈ I+

To convert the problem to the desired form we must express the objective function as the sum
of nonlinear functions of a single variable. We accomplish this by replacing the linear combina-
tion x_1 + 3x_2 in the objective function by the dummy variable d_1 and appending the constraint
that d_1 = x_1 + 3x_2, yielding the following equivalent problem:

(3)   Max  d_1^2 - 9x_2^2

      s.t. 2x_1 + x_2 ≤ 5

           d_1 - x_1 - 3x_2 = 0

      d_1, x_1, x_2 ∈ I+

If the objective function contains a product term of two variables, we can employ the device of 
completing the square to transform the problem to the desired form. To illustrate, consider the 
following nonlinear integer programming problem: 

(4)   Max  x_1^2 + 6x_1x_2

      s.t. 2x_1 + x_2 ≤ 5

      x_1, x_2 ∈ I+

At first glance, because of the product term 6x_1x_2, the objective function appears to be non-
separable. However, by adding and subtracting 9x_2^2 in the objective function we complete the
square, and by factoring the objective function becomes:

      (x_1 + 3x_2)^2 - 9x_2^2

The problem is now identical to the previous example and is convertible to the desired form by 
the introduction of a dummy variable. 
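The equivalence of these transformations is easy to check numerically. A minimal sketch (plain Python, not from the paper) verifies the completing-the-square identity and the dummy-variable substitution on a grid of integer points:

```python
# Verify x1^2 + 6*x1*x2 == (x1 + 3*x2)^2 - 9*x2^2, and that the
# dummy-variable form d1^2 - 9*x2^2 with d1 = x1 + 3*x2 agrees.

def original(x1, x2):
    return x1 ** 2 + 6 * x1 * x2

def separable(d1, x2):
    # objective after the substitution d1 = x1 + 3*x2
    return d1 ** 2 - 9 * x2 ** 2

for x1 in range(6):
    for x2 in range(6):
        d1 = x1 + 3 * x2        # enforced by the appended equality constraint
        assert original(x1, x2) == separable(d1, x2)
print("identity holds on all 36 points")
```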

The problem given by (1) is very difficult to solve with existing methods. If the problem 
contains only one or two constraints, it may be amenable to solution by dynamic programming. 
If all fj are nondecreasing functions and c and A are nonnegative, then the imbedded state 
space approach presented by Morin and Marsten [7,8] may be employed to help mitigate the 
"curse of dimensionality" normally encountered in problems having several constraints. Also, if 
the problem is of very small size and can be converted (using the binary expansion) to a zero- 
one polynomial problem [11], a solution may be obtainable using either the transformation of 
Watters [13], or a zero-one polynomial algorithm such as that given by Taha [12]. However, 
many nonlinear integer programming problems of both practical and theoretical significance fall
into neither class and are therefore essentially unsolvable by methods other than the GIPC2
algorithm presented here.


2. OVERVIEW OF THE ALGORITHM

The GIPC2 algorithm is based upon the notion that although the solution space of an
integer programming problem may be large, it is finite. The general approach of the algorithm
is to implicitly enumerate, by means of a fathoming test, a set of candidate solutions to the
problem. The set of candidate solutions is defined in such a way that it necessarily contains the
optimal solution to the problem. The general phases of the algorithm are as follows:

I. Find a good feasible solution to the problem. 

II. Determine a vector of upper and lower bounds on x.

III. Generate a set of candidate solutions to the problem. This set should be as small as 
possible, while necessarily containing the optimal solution. 

IV. Implicitly search the set of candidate solutions for the optimal solution to the prob- 
lem. 

Note that in developing an implicit enumeration algorithm for the zero-one integer pro-
gramming problem, the special structure of the problem can be exploited to eliminate Phases II
and III. The set of candidate solutions can be simply defined as the set of vectors produced by
all combinations of assignments of zero and one to each variable in the problem. The major
task of the Balas algorithm [1] consists of essentially Phase IV: implicit enumeration of candi-
date solutions. However, in the nonlinear integer programming problem given by (1), our task
is more difficult. If we define the set of candidate solutions as simply all combinations of feasi-
ble integer values assigned to each variable, the number of candidate solutions can become so
large, for some problems, as to make the approach computationally intractable. The key to the
success of the GIPC2 algorithm, therefore, is the ability of the procedure to limit the set of
candidate solutions to a manageable size, while guaranteeing that the optimal solution is con-
tained within the set.

The steps of the algorithm require the solution of several linear programming problems
for obtaining bounds on the optimum nonlinear solution. Our approach to solving the non-
linear problem consists essentially of substituting one of three different linear approximating
functions for the nonlinear objective function at each step in the algorithm where a linear pro-
gramming solution is required. The three linear approximating functions are defined as follows:

    c_0 x : a "good" linear approximating function to the nonlinear objective function.
            The linear function c_0 x does not necessarily bound the nonlinear function
            above or below.

    c_l x + a_l : a "good" lower bounding linear approximating function to the nonlinear
            objective function. For all x in the domain,

                c_l x + a_l ≤ g(x)

    c_u x + a_u : a "good" upper bounding linear approximating function to the nonlinear
            objective function. For all x in the domain,

                c_u x + a_u ≥ g(x)






All the linear programming solutions in the algorithm are used exclusively to obtain either a 
feasible solution or an upper or lower bound on the optimal solution. By appropriately selecting 
our linear approximating function for each linear programming problem we set bounds that nar- 
row the range of search. Since the feasible region is not altered, the algorithm guarantees an 
exact solution to the nonlinear integer programming problem. 

A number of excellent methods exist for computing a linear approximation to a separable 
nonlinear function. These include a least squares fit procedure and a linear programming for- 
mulation to minimize either the sum of the absolute values or the maximum deviation. 
Geoffrion [3] discusses the use of objective function approximations in mathematical program- 
ming and presents methods for determining the "best" approximation for cases of particular 
interest in mathematical programming. GIPC2 employs a less elegant approximating procedure,
but because of its simplicity and the nature of the problem, the procedure is well suited for this
particular application. It should be noted that although the specific approximation employed
will not affect the accuracy of the final solution obtained by the GIPC2 algorithm, poor approxi-
mations will have the consequence of increasing the computation time and storage require-
ments of the algorithm.

3. FINDING A GOOD FEASIBLE SOLUTION (PHASE I)

Our objective is to compute SMIN, a lower bound on the optimal objective function 
value, by finding a good feasible solution to the problem. We accomplish that by the following 
steps: 

1. Replace g(x) by c_0 x and solve the resulting linear programming problem.

2. Force the continuous solution to a good feasible integer point, x, by successively test- 
ing each fractional variable at its rounded down and rounded up value, then fixing 
the variable at the integer point associated with the largest feasible value of the objec- 
tive function. 

3. Compute SMIN by substituting the integer point x from step 2 into the nonlinear
objective function. If step 2 fails to yield a feasible integer solution, SMIN is set to a
large negative number or may be specified on data input if a lower bound is known.

4. COMPUTING UPPER AND LOWER VARIABLE BOUNDS (PHASE II) 

Once SMIN has been obtained, we next establish good upper and lower bounds on the
variables so that the range of search may be narrowed. We do this by solving two linear pro-
gramming problems of the form maximize x_j and minimize x_j, subject to the original con-
straints of the problem and the additional constraint that the objective function be greater than
or equal to SMIN. For the nonlinear objective, this procedure would produce a nonlinear con-
straint, which we desire to avoid. By noting that since c_u x + a_u ≥ g(x), then
c_u x + a_u ≥ SMIN, we will replace the nonlinear constraint g(x) ≥ SMIN with a series of
linear constraints which conservatively approximate the single nonlinear constraint. The pro-
cedure for computing upper and lower bounds on x with a nonlinear objective function is as fol-
lows for each variable in the problem:

1. Determine initial variable bounds by solving the following two linear programming
   problems:

        Max  x_j                          Min  x_j

        s.t. Ax ≤≥ b                      s.t. Ax ≤≥ b

             c_u x ≥ SMIN - a_u                c_u x ≥ SMIN - a_u





2. Compute c_u x + a_u based on the current variable bounds.

3. Using the new c_u x + a_u, generate a new constraint c_u x ≥ SMIN - a_u and append it
   to both linear programming problems.

4. Solve the two linear programming problems to determine new variable bounds. If
   the variable bounds have been improved as shown by a reduction in domain, a
   stronger "cut" may be possible, so go to step 2.

   Otherwise terminate the procedure and use the current variable bounds, UB_j and
   LB_j.

The procedure will obviously terminate at some point with no improvement, possibly with
bounds that uniquely determine the value of some or all variables. Although the number of
iterations required is problem dependent, the procedure typically converges within two or three
iterations.
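On a toy instance the effect of this phase is easy to see. The sketch below uses hypothetical data and exact integer enumeration in place of the LP surrogate c_u x + a_u, so it is only workable at toy scale, but it computes the kind of bound tightening the LP-based procedure approximates:

```python
from itertools import product

# Toy problem (hypothetical data): max g(x) = x1**2 + 2*x2
# s.t. x1 + x2 <= 4, x in I+, with each variable known to lie in 0..4.
# Suppose Phase I produced the feasible point (3, 1), so SMIN = g(3, 1) = 11.
g = lambda x1, x2: x1 ** 2 + 2 * x2
SMIN = g(3, 1)

# Integer points that are feasible AND can still match the incumbent:
points = [(x1, x2) for x1, x2 in product(range(5), repeat=2)
          if x1 + x2 <= 4 and g(x1, x2) >= SMIN]

# Tightened bounds on each variable over the surviving points:
lb = [min(p[j] for p in points) for j in (0, 1)]
ub = [max(p[j] for p in points) for j in (0, 1)]
print(lb, ub)   # -> [3, 0] [4, 1]
```

The search range for x_1 shrinks from 0..4 to 3..4 and for x_2 from 0..4 to 0..1, which is exactly the narrowing Phase II is after.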

5. GENERATING THE SET OF CANDIDATE SOLUTIONS (PHASE III) 

In Phase III we enumerate in an efficient manner solutions that yield a value equal to or
greater than SMIN, possibly including some solutions that are infeasible. In Phase IV we will
identify the optimal (feasible) solution.

It is convenient to transform each domain LB_i to UB_i found in Phase II into a domain
0 to (UB_i - LB_i). This is done by defining a new vector y as y = x - LB and replacing x in our
original problem with LB + y. Also, if the lower bound and upper bound are equal for any
variable y_i, we can delete y_i from the problem as we know its optimal value is zero (the optimal
value of x_i = LB_i = UB_i). Our problem now becomes:

(5)        Max  G(y) + g(LB)

           s.t. Ay ≤≥ b̄

where G(y) = g(y + LB) - g(LB) and the bar over the constant array b denotes
modification of the original values due to the substitution y = x - LB. From Phases I and II
we also know:

(6)        G(y) ≥ S̄MIN

           0 ≤ y ≤ ŪB

where

      S̄MIN, the new lower bound on the objective function after the transformation
      from x to y, is S̄MIN = SMIN - g(LB)

      ŪB is the vector of upper bounds on y and is equal to UB - LB.

Our procedure for generating the set of candidate solutions to problem (5) consists of
enumerating all y vectors satisfying the conditions given by (6) and the constraint set Ay ≤≥ b̄.
Note that the optimal y vector will satisfy all the above conditions and therefore will necessarily
be contained within the set of candidate solutions. In order to facilitate the computations, we
relax the condition Ay ≤≥ b̄ at several points in the procedure. As a result of this relaxation,






the set of candidate solutions which is generated may contain entries which are not feasible to 
our problem. This relaxation allows us to accomplish the enumeration of y vectors in a recur- 
sive tabular fashion akin to the procedure employed in discrete dynamic programming. How- 
ever, Bellman's "Principle of Optimality" is never invoked in the process and, therefore, the 
assumption of monotonicity is not required in the development. Because of the similarities 
between discrete dynamic programming and the recursive tabular procedure employed here for 
enumerating the y-vectors, it will aid our discussion to borrow the following dynamic program- 
ming terminology.

STAGE: a function of a single variable

STATE: the state at stage k is the value of G(y) resulting from an assignment of
integer values to y_k at stage k through y_n at the last stage, inclusive

DECISION: a nonnegative integer assignment to an element of y

NDEC: the number of decisions made at a given stage-state

Our general procedure is to generate a table for each stage containing all potentially
optimal states and the corresponding decisions at that stage which, in conjunction with prior
decisions, produce that state value. By means of certain tests, we exclude a large number
of entries from the tables by ascertaining that they are either infeasible or nonoptimal to our
problem. Assignments which the tests fail to exclude, and which are thus contained in the
tables, are termed candidate solutions to our problem.

The computations begin at the last stage (n) and recursively proceed to stage 1. The
stage n computations are performed as a special case, with the computations for stages n - 1,
n - 2, ..., k, ..., 1 proceeding as the general case. Therefore details of the stage n and stage k
computations are sufficient to fully describe the algorithmic procedure for generating the set of
candidate solutions to the problem.

Stage n Computations 

In the stage n computations we simply enumerate in tabular form all possible state values
for the integer domain of y_n. The following table is produced.

            STAGE n

    STATE        NDEC    DECISIONS

      0            1        0
    G(1)           1        1
    G(2)           1        2
      .            .        .
      .            .        .
    G(ŪB_n)        1       ŪB_n




Stage k Computations 



The general stage computations begin by forming and initializing two vectors, named
TVEC and LVEC. The first records the total state value for each possible decision, and the
second records location information relative to the previously generated stage. These vectors
are used simply to produce efficiently the STATE, NDEC, DECISIONS table for each stage.

The vectors TVEC and LVEC are initially dimensioned equal to the number of possible
decisions at stage k. The d-th entry in TVEC and LVEC corresponds to the decision y_k = d,
with d initially ranging from 0 to ŪB_k. However, we will show that as the state value at a given
stage increases, the number of possible decisions at that stage decreases. We will take advan-
tage of this property to continuously reduce the dimension of TVEC and LVEC as the compu-
tations for stage k proceed.

Each entry in TVEC, corresponding to a given decision d assigned to y_k, is the total state
value at stage k. The total state value for each decision is comprised of a fixed state contribu-
tion at stage k combined with the i-th total state at stage k + 1, where i is given in the
corresponding location in LVEC. Defining S_{k+1,i} as the i-th state value in the stage k + 1 table,
all possible total state values at stage k resulting from y_k = d are given by:

(7)        t_d(i, k) = g_k(d + LB_k) - g_k(LB_k) + S_{k+1,i}

where i is defined from 1 to the number of state values in the stage k + 1 table and g_k denotes
the objective function for the k-th stage. Note that by systematically indexing (7) over all i for
each entry in TVEC we can generate in ascending order of magnitude all possible state values
and the corresponding decisions for stage k. This recursive relationship, in conjunction with
two exclusion tests, is the basis for generating the set of candidate solutions to the problem.
The purpose of the two exclusion tests is to exclude as many states and corresponding decisions
as possible from the stage k table by discerning that they are either infeasible or nonoptimal to
the problem.

i 

^'xclusion Test A 

At each stage k, solve the following LP:

                        n
(8)        V = min      Σ  (c_lj y_j + a_lj)
                       j=k

           s.t.  c_u y ≥ S̄MIN - a_u

                 Ay ≤≥ b̄

                 y ≥ 0

where c_l, c_u, a_l, and a_u are the constants from the previously defined linear bounding func-
tions and the subscript j is used to denote the j-th stage. Exclude all state values for which
t_d(i, k) < V, and revise the lower bound LB_k accordingly. The optimal integer solution y* of
the n-variable problem will be a point within the feasible region given in (8). It will have a
state value at stage k, as given by the objective of (8), when only its y*_j for j = k,
k + 1, ..., n are considered. V is a lower bound on the minimum state value at stage k con-
sidering all of the points in the feasible region. It follows that y*_j, j = k, k + 1, ..., n, the
optimal decisions at stages k through n, will yield a state value ≥ V and will not be excluded by
discard of all state values < V.






Exclusion Test B 

At each stage k, solve the following two LP's, one with yi, fixed at its current lower bound 
and the other with y/^ fixed at its current upper bound. 



w = 


max 


J=k 


Z = max 




S.t. 




c^ > SMIN - a^ 
Ay $b 

yk = LB, 


S.t. 


c^ ^ SMIN - 
Ay ^b 

yk = UB, 



(9) (10) 



where: LBj, denotes the current lower bound on yj^ (initially 0) 

UBi^ denotes the current upper bound on >'^. (initially UBj^ — LBj^). 

Exclusion test B is dynamic in the sense that the bounds on y_k are continuously tightened as
larger and larger state values are generated by the algorithm. This tightening of bounds is
accomplished as follows:

(a) if the current state > W or if the problem is infeasible, replace LB_k by LB_k + 1 and
    compute the new value of W

(b) if the current state > Z or if the problem is infeasible, replace ŪB_k by ŪB_k - 1 and
    compute the new value of Z

Recall the general process of generating candidate solutions using the TVEC and LVEC vectors.
At stage k candidate values (d) are assigned to y_k starting with the current LB_k. A state value

       n
       Σ  G_j(y_j)
      j=k

will result. However, if it exceeds W, an upper bound on the maximum feasible
state value with y_k equal to LB_k, or if there is no feasible solution to problem (9), then LB_k is
clearly not a valid assignment. The lower bound may be increased by one, to seek a feasible
solution and/or a new increased value of W. Based on similar reasoning, the current upper
bound ŪB_k may be tightened, by reducing it by one, whenever the state value exceeds Z, an
upper bound on the maximum feasible state value with y_k equal to ŪB_k, or whenever there is
no feasible solution to problem (10). The proof that the optimal state value and corresponding
decision y*_k will not be excluded follows from the fact that initially LB_k ≤ y*_k ≤ ŪB_k. As
larger and larger state values are generated, the bounds on y_k will tighten until the upper and
lower bounds are equal to y*_k. The bounds cannot be tightened to exclude y*_k since y* is feasi-
ble to the constraint set given in (9) and (10); therefore the corresponding state must be less
than or equal to W and Z.

It should be noted that although we must recompute the values of W or Z when we
tighten the upper or lower bound on y_k, there is no need to solve the entire LP given in (9) or
(10) again. Since we are only changing one of the right-hand-side constants of the original LP,
we can make use of the basis inverse to update the final tableau and employ the dual simplex
algorithm when necessary to regain feasibility.

The step-by-step procedure for generating the stage k table is as follows:

1. Compute the lower bound V as given by (8).

2. Form and initialize the vectors TVEC and LVEC, where the d-th entry in TVEC is
   given by

       t_d = g_k(d + LB_k) - g_k(LB_k) + S_{k+1,i}

   where d initially ranges from 0 to ŪB_k and where i is chosen such that t_d is the
   minimum value greater than or equal to V. (Exclusion Test A.) The value of i is
   recorded as the d-th entry in LVEC.

3. Compute upper bounds W and Z as defined by (9) and (10) and use them to elim-
   inate any infeasible decisions.

4. Flag all entries having the smallest state value in TVEC.

5. If the smallest state value is less than or equal to both W and Z, go to step 6. Other-
   wise, apply Exclusion Test B to tighten bounds on y_k. If LB_k > ŪB_k, the stage k
   table is complete. Otherwise go to step 4.

6. Enter values for the STATE, NDEC, and DECISIONS as one row of the stage k
   table.

7. Update the flagged entries in TVEC to the next largest possible state value for
   that decision by increasing i by 1 and update LVEC accordingly. If the d-th entry in
   TVEC is flagged, the updated TVEC(d) and LVEC(d) are given by

       t_d = t_d + S_{k+1,i+1} - S_{k+1,i}
       i_d = i_d + 1

   Go to step 4.
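The ascending-order generation of steps 4-7 can be sketched in code. The version below replaces Exclusion Tests A and B with a single fixed cutoff v_min (so it keeps some entries the LP tests would remove, and assumes LB_k = 0 so gk takes the decision d directly) and uses a heap in place of the TVEC/LVEC bookkeeping:

```python
import heapq

def stage_table(gk, ub_k, next_states, v_min):
    """Build a stage-k table: every state gk(d) + S_{k+1,i} >= v_min,
    listed with the decisions d that produce it, in ascending order.
    next_states is the sorted STATE column of the stage k+1 table.
    The paper's LP exclusion tests are replaced by the cutoff v_min."""
    heap = []
    for d in range(ub_k + 1):
        # smallest admissible total for this decision (step 2 analogue)
        for i, s in enumerate(next_states):
            t = gk(d) + s
            if t >= v_min:
                heapq.heappush(heap, (t, d, i))
                break
    table = {}
    while heap:                 # steps 4, 6, 7: pop smallest, record, advance
        t, d, i = heapq.heappop(heap)
        table.setdefault(t, []).append(d)
        if i + 1 < len(next_states):
            heapq.heappush(heap, (t + next_states[i + 1] - next_states[i], d, i + 1))
    return table

# Stage-2 table of the worked example that follows: g2(y2) = 3*y2,
# stage-3 states 0, 2, ..., 12, cutoff 15.
t2 = stage_table(lambda d: 3 * d, 6, list(range(0, 13, 2)), 15)
print(t2[22])   # -> [4, 6]
```

Without Exclusion Test B the table retains extra entries (for instance decision 2 also appears for state 18 here), which the LP-based bounds in the paper would prune.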

Example

To illustrate the computations in generating the set of candidate solutions, consider
the following problem:

    Enumerate:  y_1 + 3y_2 + 2y_3 ≥ 20

                0 ≤ y_1 ≤ 5
                0 ≤ y_2 ≤ 6       where the constraint set Ay ≤ b is:
                0 ≤ y_3 ≤ 6           y_1 + y_2 + y_3 ≤ 8

                y ∈ I+

The computations follow the step-by-step procedure outlined above (beginning at
stage 3) and produce the following tables:



       STAGE 3                      STAGE 2                      STAGE 1

STATE  NDEC  DECISIONS       STATE  NDEC  DECISIONS       STATE  NDEC  DECISIONS

  0     1       0              18     2     4, 6            20     3     0, 1, 2
  2     1       1              19     1     5               21     2     0, 1
  4     1       2              20     2     4, 6            22     2     0, 1
  6     1       3              21     1     5
  8     1       4              22     1     6
 10     1       5
 12     1       6



Candidate solutions are recovered from the tables by tracking through the tables begin-
ning at Stage 1 and working towards the last stage. This tracking process can be thought of as
generating a combinatorial "tree" of solutions for a specified starting or "goal" state. The nodes
of the tree correspond to a given stage and state, and the branches emanating from the node
correspond to the alternate decisions for that stage and state. A path through the tree
represents an assignment of integer values to each stage of the problem. For example, with a
state value of 22 we have two candidate solutions: y_1 = 0, y_2 = 6, y_3 = 2 and y_1 = 1, y_2 = 5,
y_3 = 3, the latter being non-feasible to the Ay ≤ b constraint.
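For a problem this small the tables can be cross-checked by brute force, something the implicit enumeration of the next section avoids at realistic sizes. A sketch, not part of the paper's procedure:

```python
from itertools import product

def G(y):
    return y[0] + 3 * y[1] + 2 * y[2]

# Candidate-like points: within the variable bounds with G(y) >= 20.
# (The LP exclusion tests prune this set further than the cutoff alone does.)
candidates = [y for y in product(range(6), range(7), range(7)) if G(y) >= 20]

# Feasibility against the constraint set Ay <= b:
feasible = [y for y in candidates if sum(y) <= 8]

best = max(feasible, key=G)
print(best, G(best))   # -> (0, 6, 2) 22
```

The best feasible state value is 22, achieved only at y = (0, 6, 2), matching the first candidate solution recovered from the tables above.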

6. IMPLICIT ENUMERATION OF CANDIDATE SOLUTIONS (PHASE IV) 

In generating the set of candidate solutions, we have excluded only state values and 
assignments which could be shown to be infeasible or non-optimal to our problem. Therefore 
the optimal feasible solution to our problem is necessarily contained within the set of feasible 
and possibly infeasible solutions. Our strategy is to search for a solution within the set which is 
feasible with respect to the constraints of our problem. To guarantee that the first feasible solu- 
tion found is also optimal, the search is performed starting with the largest state value at Stage 
1 and working towards the smallest state value at Stage 1. 

For a given goal state, the number of candidate solutions is simply the number of paths 
emanating from the corresponding state of Stage 1. For simple trees explicit enumeration, by 
substituting each candidate solution into the constraint set of the problem and testing for feasi- 
bility, is quite practical. Due to the combinatorial nature of the tree, this approach can become 
computationally overburdening for larger problems. We will therefore employ a method for
implicitly evaluating candidate solutions. Through the application of a fathoming test, large por-
tions of the combinatorial tree will be exempted from enumeration.

The implicit evaluation procedure starts by selecting the largest state value at Stage 1; this
is termed the present goal and there will be a corresponding tree. The examination of paths
through the tree is performed by the systematic assignment of values to the y-variables at each
stage, starting at the first stage by assigning a value to y_1. A partial evaluation at stage j is
defined as the assignment of integer values from the first stage node through the j-th stage node,
inclusive. The state contribution resulting from this partial evaluation is designated ZINT. All
paths through the tree will yield the present goal, but not all paths will yield y-values that
satisfy the constraints of the original problem.

Our purpose is to devise a test to detect as early as possible in the search if a particular 
branch (and its sub-branches) cannot yield a candidate solution feasible to the constraints of the 
original problem. We accomplish this by comparing the goal state value to the sum of ZINT 
and the continuous maximal solution (ZCONT) for the y-variables not yet assigned values.
Since ZCONT is greater than or equal to any feasible integer completion for the unassigned y- 
variables, if the sum of ZINT plus ZCONT is less than the present goal state, we can exclude 
all integer completions of this partial evaluation from consideration. At this point the branch is 
said to be "fathomed" and we "backtrack", that is, go back to the preceding node and evaluate 
the remaining branches emanating from it. 

If the present goal is achieved, by completing a path through the tree without violating
any constraints of the original problem, the current assignment to y is added to the lower bound
vector (LB) of x and the algorithm terminates. If the present goal is not achieved, that is, if
no feasible path through the tree exists, then the next largest state value at Stage 1 is used as
the goal state and its applicable tree is searched. The algorithm will terminate because the
range of state values generated is bounded such that it includes at least one feasible y-vector.
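The goal-directed search with fathoming can be sketched as below, using the data of the Phase III example. Here ZCONT is computed from a simple box relaxation of the unassigned stages rather than from the continuous LP the paper uses, so the bound is weaker, but the fathom-and-backtrack control flow is the same:

```python
# Example data: G(y) = y1 + 3*y2 + 2*y3, bounds y <= (5, 6, 6),
# constraint y1 + y2 + y3 <= 8 (all taken from the Phase III example).
coef, ub, row, rhs = [1, 3, 2], [5, 6, 6], [1, 1, 1], 8

def search(goal):
    """Depth-first search for y with G(y) == goal and row . y <= rhs.
    A partial evaluation (ZINT) is fathomed when ZINT + ZCONT < goal."""
    def extend(j, y, zint):
        if j == len(coef):
            ok = zint == goal and sum(a * v for a, v in zip(row, y)) <= rhs
            return y if ok else None
        # ZCONT: box-relaxation bound on the unassigned stages
        zcont = sum(c * u for c, u in zip(coef[j + 1:], ub[j + 1:]))
        for d in range(ub[j] + 1):
            z = zint + coef[j] * d
            if z + zcont < goal or z > goal:   # fathom; backtrack to next branch
                continue
            found = extend(j + 1, y + [d], z)
            if found is not None:
                return found
        return None
    return extend(0, [], 0)

print(search(22))   # -> [0, 6, 2]
```

With goal state 22 the search fathoms every branch except the one leading to the feasible candidate y = (0, 6, 2) found in the Phase III example.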




The implicit evaluation method described above requires a computationally efficient pro-
cedure for computing ZINT + ZCONT at each partial evaluation. One method would be to
compute ZCONT by solving for the unassigned y-variables as a linear programming problem.
For many problems this would necessitate solving a large number of linear programming prob-
lems. However, by viewing each assignment to y as a change to the right-hand side of the con-
tinuous LP, ZINT + ZCONT can be conveniently computed using sensitivity analysis.
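The sensitivity-analysis shortcut can be made explicit. The sketch below uses standard LP right-hand-side sensitivity; the basis notation B, c_B and the index set F are ours, not the paper's.

```latex
% Standard LP right-hand-side sensitivity (a sketch of the idea above):
% with B the optimal basis of the continuous relaxation,
\[
\mathrm{ZCONT}(b') \;=\; c_B^{\top} B^{-1} b',
\qquad
b' \;=\; b \;-\; \textstyle\sum_{j \in F} A_j\, y_j ,
\]
% where F indexes the y-variables fixed so far.  The formula remains valid
% while $B^{-1} b' \ge 0$, so ZINT + ZCONT can be updated from the stored
% dual prices without re-solving the linear program at each node.
```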

7. COMPUTATIONAL EXPERIENCE 

The performance of an integer programming algorithm is measured by its ability to solve a 
wide class of integer programming problems within reasonable computer time and storage limi- 
tations. To provide a basis for comparison with existing algorithms, the GIPC2 procedure was 
programmed in ANSI FORTRAN and implemented on the Purdue CDC 6500 computing sys- 
tem. Instructions for its use and a FORTRAN listing of the program are provided in reference 
[9]. 

The GIPC2 algorithm was evaluated on a set of seventeen linear and seventeen nonlinear 
test problems. Although the algorithm was specifically designed to solve the nonlinear integer 
programming problem, we were interested in evaluating the performance of the algorithm on 
linear problems as a special case. To provide a basis for comparing with existing linear integer 
programming algorithms, the seventeen linear test problems were also solved using a Branch 
and Bound code. 

The main difficulty in comparing the computational efficiency of different integer pro- 
gramming algorithms is in developing a representative test problem set containing problems of 
varying size and difficulty. It is important to note that the relative performance of two integer 
programming algorithms may be highly dependent upon the test problem set used. In addition, 
it should be noted that problem size is only one factor in determining problem difficulty, and 
this factor is often dominated by problem structure. A problem with only five variables can be 
significantly more difficult to solve than a problem with twenty-five or more variables. 

The set of linear test problems used in this investigation includes four problems containing 
five variables each developed by Haldi [4], and thirteen additional problems of larger size. The 
four problems of Haldi, despite their small size, are difficult problems to solve and have been 
used extensively as a test bed for integer programming algorithms. Problem number five is a 
system design problem given by Petersen [10] and contains fourteen integer variables. The 
remaining twelve problems were randomly generated, range in size from ten variables to 
twenty-five variables, and differ widely in their difficulty to solve. None of the test problems 
have explicit upper-bounded variables. 

The Branch and Bound code used in the investigation is the MINT mixed integer pro-
gramming algorithm [6] based on the BBMIP code developed by the IBM Corporation [5] for 
the IBM 360 models 25 and above. The program is written in FORTRAN and is based upon 
the Dakin improved procedure of Land and Doig. A more modern code such as MPSX-
MIP/360 or UMPIRE was not available on the Purdue CDC system, or it would have been 
used for a more meaningful comparison. 

The computation times for the seventeen linear test problems are presented in Table 1. 
All times are in seconds and are for the Purdue CDC 6500 computing system. Times given in 
the table that are preceded by a greater-than sign indicate that the respective algorithm ter-
minated without an optimal solution established after that amount of computation time. 



606

C.D. PEGDEN & C.C. PETERSEN



TABLE 1. Computational Experience — Linear

Problem    Number of     Number of      Computation Time (secs)
Number     Constraints   Variables        GIPC2         MINT

   1            4             5             .545        4.099
   2            4             5             .400        2.972
   3            6             5             .608        3.375
   4            6             5             .434        3.457
   5            8            14            7.453       36.297
   6            5            10             .811       20.221
   7            5            10            1.042       21.001
   8           10            10             .804         .537
   9           10            10             .888        1.488
  10            5            20            4.195       >188.
  11            5            20            3.803       30.250
  12           10            20           10.261       32.882
  13           10            20           10.430        3.549
  14            5            25            7.422       >188.
  15            5            25            5.610       30.758
  16           10            25           21.545       32.364
  17           10            25           64.497*       5.440

*Reduced to 13.3 seconds by reordering variables in ascending order of their
domain (see suggested modification in Section 8 below).

The GIPC2 code clearly outperformed the MINT code in solving the test problems of 
Haldi. Note that the MINT code required more time to solve problem 1 of Haldi containing 
only five variables than it required to solve problem 13 containing twenty variables. The test 
problems of Haldi clearly illustrate that problem structure can be more significant in determin- 
ing problem difficulty than problem size. 



In test problems 5 through 17, neither algorithm computationally dominates the other. 
The results for test problems 5, 6, 7, 9, 11, 12, 15, and 16 tend to indicate that the GIPC2 code 
is an average of 8.7 times faster than the MINT code, and the performance in problems 10 and 
14 shows GIPC2 vastly superior. However, this conclusion is contradicted by the results of test 
problems 8, 13, and 17, where the MINT code is 2.3 times faster than the GIPC2 code. The 
performance of the GIPC2 and MINT codes on these problems illustrates the unpredictable 
performance that is associated with integer programming algorithms. 



A significant point of superiority of the GIPC2 code, however, is illustrated by compara- 
tive results on problems 10 and 14. Although the Branch and Bound procedure has been 
employed successfully to solve a number of large problems [2], it is sometimes misled into tak- 
ing the wrong path early in the search. As a consequence, the Branch and Bound procedure 
can require an excessive amount of computer time to solve even relatively small problems. 
The MINT code failed to solve problems 10 and 14 after 188 seconds of computation. The 
computational results to date tend to indicate that the GIPC2 algorithm is less susceptible to 
getting sidetracked with large running times. 

The performance of the GIPC2 algorithm in solving integer programming problems with 
separable nonlinear objective functions was investigated by solving a set of seventeen nonlinear 
test problems. The nonlinear test problems were generated by using the constant arrays from 
the linear problem set, with five or more of the linear terms in the objective function 
replaced by nonlinear terms. Problems 18 through 21 each contain five variables and were 






constructed from the problems of Haldi by replacing the linear objective function with five non- 
linear stages. Problem 22 is a nonlinear version of the system design problem given by Peter- 
sen. The remaining twelve problems each contain from twenty-five to fifty variables and either 
five, fifteen, twenty, twenty-five, forty, or fifty nonlinear stages. 

Computation times for the seventeen nonlinear test problems are given in Table 2. The 
computation times compare favorably with computation times for linear problems of similar 
size. Note that test problems 32 and 34, containing forty and fifty nonlinear stages respectively 
and ten constraints, each solved in less than twenty seconds. The data tend to suggest that, for 
the GIPC2 algorithm, the integer programming problem with a nonlinear objective function is 
roughly as difficult as the linear integer programming problem. The ability of 
the GIPC2 algorithm to solve the integer programming problem containing a separable non-
linear objective function in roughly equivalent times to that required to solve the linear integer 
programming problem is one of the primary contributions of the research reported here. 

TABLE 2. Computational Experience — Nonlinear

Problem    Number of     Number of     Number of        GIPC2
Number     Constraints   Variables     Nonlinear     Computation
                                        Stages       Time (secs)

  18            4             5             5             .202
  19            4             5             5             .212
  20            6             5             5             .203
  21            6             5             5             .215
  22            8            14            12           10.312
  23           10            25             5            4.905
  24           10            25             5           15.494
  25           10            25            15            7.550
  26           10            25            15           20.204
  27           10            25            20            6.998
  28           10            25            25            5.522
  29           10            25            25            7.299
  30           10            25            25            9.756
  31           10            40            25           12.493
  32           10            40            40           11.892
  33           10            50            25           17.596
  34           10            50            50           18.717



The application of the nonlinear capability of the GIPC2 algorithm to a practical problem 
is illustrated by test problem 22, the nonlinear version of problem 5. In the original problem 
the system maintenance and operating costs which are to be minimized were assumed to be a 
linear function of the number of system components by type. However, in many systems the 
maintenance and operating costs are a nonlinear function of the number of system components 
by type. The restrictive linearity assumption is imposed primarily as a consequence of the lack 
of practical algorithms for solving the nonlinear integer programming problem. However, the 
GIPC2 algorithm solved the more descriptive nonlinear version of the systems design problem 
in 10.312 seconds, as compared to 36.297 seconds required by the MINT code to solve the 
linear version of the problem. 



A major difficulty encountered in evaluating GIPC2 for solving nonlinear integer program-
ming problems is in verifying that the solutions obtained are indeed optimal. The nonlinear 




test problems are difficult problems to solve and alternate methods of solution apparently do 
not exist. To verify the accuracy of the GIPC2 algorithm in solving the nonlinear integer pro- 
gramming problem, a relatively simple ten-variable, five-constraint, nonlinear integer program- 
ming problem was exhaustively enumerated. The enumeration required approximately thirty- 
five minutes of computation time on the Purdue CDC 6500. The problem was solved by the 
GIPC2 algorithm yielding the same solution in approximately two seconds. 

8. CONCLUSIONS 

The generalized implicit enumeration scheme described in this paper can solve both linear 
and nonlinear integer programming problems. Computational experience indicates that the 
presence of nonlinearities has little or no effect on the computational efficiency of the algo-
rithm. This attribute of the GIPC2 algorithm should allow for the formulation and solution of 
integer programming problems which fully consider the economies of scale which exist in the 
real world. 

A modification which would facilitate the use of the GIPC2 algorithm in solving larger 
problems is the replacement of the present simplex subroutine with a revised simplex method 
possessing implicit upper bounding procedures for the variables. This would allow the initial 
data matrix of the problem to be stored in external storage and would also avoid the need for 
inclusion of explicit upper bound constraints on the variables. This last feature would be partic- 
ularly useful in solving zero-one integer programming problems. 

A simple modification to the GIPC2 code that would result in considerably reduced com- 
putation time consists of incorporating a scheme for automatically reordering the variables in 
ascending magnitude of their domain prior to generating the set of candidate solutions. As a 
consequence of this reordering, the trees of candidate solutions would tend to be sparse near 
the top (Stage 1). As a result the number of partial evaluations examined would be reduced. 
The effect of this modification is illustrated by test problem 17 which originally required 64.5 
seconds to solve. After manually reordering the variables in ascending order of their domain, 
the problem was solved in 13.3 seconds. 
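The reordering modification amounts to a simple preprocessing pass. The sketch below is a hypothetical illustration, not the GIPC2 code; `domains` maps each variable to its sequence of candidate values.

```python
# Sketch of the Section 8 reordering modification (illustrative only).
# Variables with small domains are placed near the top of the search tree
# (Stage 1), so the trees of candidate solutions are sparse where pruning
# pays off most.

def reorder_by_domain(domains):
    """Return variable indices sorted by ascending domain size.

    `domains` is a list of per-variable candidate-value sequences, e.g.
    domains[j] = range(lb[j], ub[j] + 1).
    """
    return sorted(range(len(domains)), key=lambda j: len(domains[j]))
```

For example, with domains of sizes 6, 2, and 4, the search order becomes variable 1, then variable 2, then variable 0, placing the narrowest domain at Stage 1.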

REFERENCES 

[1] Balas, E., "An Additive Algorithm for Solving Linear Programs with Zero-One Variables," 
Operations Research, Vol. 13, pp. 517-546 (1965). 

[2] Forrest, J.J.H., Hirst, J.P.H. and Tomlin, J. A., "Practical Solution of Large Mixed Integer 
Programming Problems with UMPIRE," Management Science, 20, No. 5, pp. 736-773 
(1974). 

[3] Geoffrion, A., "Objective Function Approximations in Mathematical Programming," Dis-
cussion Paper No. 61, Management Science Study Center, University of California, Los 
Angeles (May 1976). 

[4] Haldi, J., "25 Integer Programming Test Problems," Working Paper No. 43, Graduate 
School of Business, Stanford University (December 1964). 

[5] IBM Catalog of Programs for IBM System 360 Models 25 and Above, GC 20-1619-8, Pro- 
gram Number 360D-15.2.005. 

[6] Kuester, J. and Mize, J., Optimization Techniques with FORTRAN (McGraw-Hill, 1973). 

[7] Marsten, R. and Morin, T., "A Hybrid Approach to Discrete Mathematical Programming," 
Sloan School of Management, Working Paper 838-76 (March 1976). 

[8] Morin, T. and Marsten, R., "An Algorithm for Nonlinear Knapsack Problems," Manage- 
ment Science, Vol. 22, No. 10 (1976). 




[9] Pegden, CD., "An Implicit Enumeration Algorithm for Solving Integer Programming 
Problems with Linear or Nonlinear Objective Functions," Ph.D. Dissertation, Purdue 
University (August 1975). 

[10] Petersen, C.C, Systems Planning and Evaluation Techniques, Textbook in preparation. 

[11] Plane, D.R. and C. McMillan, Discrete Optimization (Prentice-Hall, Inc., New Jersey, 
1971). 

[12] Taha, H., "A Balasian-Based Algorithm for Zero-One Polynomial Programming," Manage- 
ment Science, Vol. 18, No. 6 (1972). 

[13] Watters, L.J., "Reduction of Integer Polynomial Programming Problems to Zero-One 
Linear Programming Problems," Operations Research 15, 1171-1174 (1967). 



DUALITY FOR QUASI-CONCAVE PROGRAMS 
WITH APPLICATION TO ECONOMICS 

T. R. Jefferson, G. M. Folie, and C. H. Scott 

University of New South Wales 
Kensington, N.S. W., Australia 

ABSTRACT 

A duality theory is developed for mathematical programs with strictly 
quasi-concave objective functions to be maximized over a convex set. This 
work broadens the duality theory of Rockafellar and Peterson from concave 
(convex) functions to quasi-concave (quasi-convex) functions. The theory is 
closely related to the utility theory in economics. An example from economic 
planning is examined and the solution to the dual program is shown to have 
the properties normally associated with market prices. 



1. INTRODUCTION 

Although duality theory for linear programming has been well developed and widely used 
for some time, it is only in recent years that significant advances have been made in duality 
theory for convex (concave) programs. Notable contributions have been made by Rockafellar 
[13] and Peterson [12]. Despite these developments, there are still many programming prob-
lems that are not encompassed by the existing theoretical developments. One such important 
class of mathematical programs is quasi-concave programs, and it is the purpose of this paper 
to extend the benefits of duality theory to this class of programs. 

In 1967, Arrow and Enthoven [1] developed necessary and sufficient conditions for the 
optimality of quasi-concave programs. Later Luenberger [11] developed a duality theory for 
quasi-concave programs, which separated primal and dual variables. This duality theory was 
expanded by Greenberg and Pierskalla into surrogate duality [5], [7]. 

The duality theory developed here is valid for quasi-concave programs with closed strictly 
quasi-concave objective functions. This work is motivated by the dual relationship between 
goods and prices first observed by Roy [14] and by Peterson's work in duality theory [12]. The 
major result of this paper lies in the separation of the objective function from a linear con-
straint set, which simplifies the derivation of the dual program, as well as the relationship 
between the primal and dual programs. Furthermore, the duality theory developed here is 
widely applicable to a class of problems found in economics. 

Duality theory comes naturally to linear programs via Kuhn-Tucker theory and the linear- 
ity of the problem. In general, the existence of non-linearities in mathematical programs raises 
a number of problems and makes generalizations more complex and difficult. Wolfe duality is 
an example of this problem. For concave programs, the concave conjugate transform can be 

611 



612 T.R. JEFFERSON, G.M. FOLIE & C.H. SCOTT 

used to derive a dual program, and to develop all the associated primal-dual relationships 
(Peterson [12]). 

In order to clarify the difference between the duality theory developed in this paper and 
that currently used for concave programs, as well as why a special duality theory is needed, the 
following brief digression will be made. Consider the following definitions. 

DEFINITION: The concave conjugate transform of a function g(x) defined on a set C is 
the pair [h, D] defined by 

    h(y) ≜ inf_{x∈C} {⟨x, y⟩ − g(x)} 

    D ≜ {y | inf_{x∈C} ⟨x, y⟩ − g(x) > −∞}. 

DEFINITION: The hypograph of a function g(x) is the set: 

    {(x, β) | x ∈ C, β ≤ g(x)}. 

DEFINITION: A concave (quasi-concave) function is closed when its hypograph is a 
closed set. 

DEFINITION: The supergradient of a function g at a point x is the set ∂g(x) defined by 

    ∂g(x) = {y | g(x) + ⟨y, z − x⟩ ≥ g(z), ∀ z ∈ C}. 

By construction h(y) is a closed concave function. In addition, for x ∈ C and y ∈ D we 
have the following inequality: 

(1)    g(x) + h(y) ≤ ⟨x, y⟩. 

The inequality (1) is an equality when 

    x ∈ ∂h(y)  or  y ∈ ∂g(x). 
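A one-dimensional illustration of the transform and inequality (1) may help (our example, not from the paper): take g(x) = −x²/2 on C = R, which is concave and closed.

```latex
% Concave conjugate of g(x) = -x^2/2 on C = R (illustrative example):
\[
h(y) \;=\; \inf_{x \in \mathbb{R}} \Bigl\{ xy + \tfrac{x^{2}}{2} \Bigr\}
      \;=\; -\tfrac{y^{2}}{2},
\qquad D = \mathbb{R},
\]
% the infimum being attained at x = -y.  Inequality (1) then reads
\[
-\tfrac{x^{2}}{2} - \tfrac{y^{2}}{2} \;\le\; xy
\quad\Longleftrightarrow\quad
\tfrac{(x+y)^{2}}{2} \;\ge\; 0,
\]
% with equality exactly when y = -x = g'(x), matching the supergradient
% condition y \in \partial g(x).
```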

The concave conjugate transform generates very strong relationships. If g(x) is not con-
cave, the concave conjugate transform still operates on g(x) as if it were concave. The conju-
gate transform of a non-concave function g(x) does not use the hypograph of g(x), but only 
the convex hull of the hypograph of g(x). Thus some information regarding the properties of 
g(x) is lost by this transform. 

This undesirable feature of the conjugate transform indicates the need to develop a new 
transform, which can be used to derive a duality theory for non-concave programs. 

At first, it may not appear to be a particularly serious limitation that the conjugate 
transform cannot be used with non-concave functions. However, there are cases where a 
transform that will handle quasi-concave programs is required. For instance, the conjugate 
transform cannot be used to derive dual programs for problems in economic theory. The rea-
son is that economic theorists have, over the years, reduced the restrictiveness of both consu-
mer and producer theory. The fundamental property of the utility function in the theory of 
consumer behaviour, which stems from the axioms of weak preference ordering, is that 
indifference curves, or constant utility surfaces, define convex sets. Equivalently, economists 



DUALITY FOR QUASI-CONCAVE PROGRAMS 613 

require these indifference curves to have the property of diminishing marginal rates of substitu-
tion (Green [4]). Thus the minimal property of any utility function, used to represent consu-
mer choice behaviour, is quasi-concavity. Thus in order to derive any dual programs for 
economic problems, it is essential that the transform used to obtain the dual is valid for quasi-
concave programs. 

The transform derived here has these desired properties, and will be called the utility 
transform. 

The properties of the utility transform are derived in the next section and are then used 
to develop a duality theory for quasi-concave programming. This theory is an extension of the 
duality theory developed by Luenberger [11]. An example from economics is presented at the 
end of the paper in order to elucidate the usefulness of the utility transform. 

2. THE UTILITY TRANSFORM 

A form of duality between prices and commodities in consumer theory was originally 
noted by Roy [14] in 1947. Roy's work explicitly developed a dual relationship between the 
consumer's direct utility function, which is a function of his commodity bundle, and an indirect 
utility function, which is a function of the prices of the commodities and consumer income. 
Recently this theory has been used to provide a clearer understanding of consumer theory by 
proving, in a simple manner, a large number of propositions in consumer theory. Lau [10] and 
Diewert [3] provide a useful compendium of this work. Although the concept of the utility 
transform stems from the relationship between the direct and indirect utility functions, the pur-
poses of this paper require the development of a slightly different approach to duality. Consider 
the pair [u, U] of a utility function u(x) defined on the convex set U. 

DEFINITION: The utility transform of [u, U] is a pair [v, V] defined by 

    v(p) ≜ inf_{x∈U} {− u(x) | ⟨p, x⟩ ≤ 0} 

    V ≜ {p | inf_{x∈U} [− u(x) | ⟨p, x⟩ ≤ 0] > −∞}. 

By construction, for x ∈ U, p ∈ V and ⟨p, x⟩ ≤ 0, we have the utility inequality 
holding: 

    u(x) + v(p) ≤ 0. 

The construction of v(p) and the indirect utility function differ in the following ways. 
Firstly, the linking constraint 

    ⟨p, x⟩ ≤ 0 

is generally referred to as the budget constraint, but usually in consumer theory it has a posi-
tive right-hand side. However, as will be seen later, it is convenient to absorb the right-hand 
side into the inner product. This can be done by identifying consumer income as another com-
modity, which, although the consumer has an endowment of it, he does not wish to have for its 
own sake. A more general interpretation is to consider x to measure the quantity of goods and 
services that a consumer buys and sells. We use this second approach in the example (Section 
4). Finally, we take the infimum of − u(x) rather than the supremum of u(x). These 
differences allow us to work with quasi-concave functions: v(p) is quasi-concave, whereas the 
indirect utility function is quasi-convex. More importantly though, the absorption of the right-
hand side of the budget constraint permits a complete separation of the price and commodity 
variables. 




It is accepted that the utility transform is not the only method for handling duality. 
Greenberg and Pierskalla [5] developed a surrogate dual, which in terms of the notation used 
here is defined by 

    inf_{x∈U} {− u(x) | Σ_i p_i (g_i(x) − b_i) ≤ 0} 

where g_i(x) ≤ b_i is a constraint to be satisfied. Clearly, when g_i(x) − b_i is replaced by x_i, 
the above expression is the same as the utility transform. 

The surrogate dual was further specialized for quasi-concave functions in [7] by Green-
berg and Pierskalla to the z-quasi-conjugate 

    u*(p) = z + inf_x {− u(x) | ⟨x, p⟩ ≤ z}. 

This becomes the utility transform when z = 0. In their paper on "Quasi-Conjugate Func-
tions and Surrogate Duality" [7], Greenberg and Pierskalla develop the properties of the z-
quasi-conjugate. This paper formed the basis of a further analysis by Crouzeix into the proper-
ties of quasi-concave functions [2]. 

The z variable in the z-quasi-conjugate is difficult to handle in the dual. In order to take 
cone conditions into consideration, it is necessary to have ⟨x, p⟩ ≤ 0. This means that 
z = 0, and we are left with the more convenient utility transform. 

We now develop the properties of [v, V]. In order to do this we require a relaxation of 
the concept of supergradient. 

DEFINITION: The local supergradient of a quasi-concave function u(x), x ∈ U, is the set 
∂^loc u(x) defined by: 

    ∂^loc u(x) ≜ {p | lim_{α→0+} [u(x + αΔx) − u(x)]/α ≤ ⟨Δx, p⟩, ∀ Δx 
                   such that x + βΔx ∈ U, β > 0}. 

The local supergradient is a generalization of the concept of supergradient presented ear-
lier. For the usual case of differentiable functions, the local supergradient contains a single ele-
ment: the gradient. It is through the local supergradient that [u, U] and [v, V] can be related. 

It is necessary to know this relationship in order to develop the duality theory presented 
in the next section. The relationship between [u, U] and [v, V] is formally expressed by 
Theorem 1, which is stated at the end of this section and proved in the Appendix. In order to 
prove Theorem 1, the following four lemmas are needed. 

LEMMA 1: v(p) is quasi-concave and positively homogeneous of degree zero. V is a 
convex cone. ∂^loc v(p) is positively homogeneous of degree minus one. 

PROOF: See appendix. 

LEMMA 2: For v, the utility transform of u, hypo v is closed if hypo u is closed. 

PROOF: See appendix or [7]. 




LEMMA 3: For u closed, p ∈ V, x ∈ U and ⟨p, x⟩ ≤ 0, we have the utility inequal-
ity 

(2)    u(x) + v(p) ≤ 0. 

When equality holds in (2) we have: 

    (i) λp ∈ ∂^loc u(x), λ ≥ 0 
    (ii) μx ∈ ∂^loc v(p), μ ≥ 0. 

PROOF: See appendix. 

LEMMA 4: Suppose u is a closed strictly quasi-concave function on U with utility 
transform [v, V]. 

Then either 

    (i) The maximum of u is attained at a point z ∈ U and v(p) = − u(z) for p ∈ V ∩ 
{p | ⟨p, z⟩ ≤ 0}. Let p be equivalent to q (p ≡ q) if there exists α > 0 such that p = αq. 
v(p) is a strictly quasi-concave function on V ∩ {p | ⟨p, z⟩ ≤ 0} with respect to the quo-
tient space defined by this equivalence relation, or 

    (ii) The supremum of u(x) is infinity. v(p) is a strictly quasi-concave function on V 
with respect to the quotient space defined in (i). See appendix for proof. 

THEOREM 1: Let [u, U] have the following properties: 

    (i) the hypograph of u is closed, 

    (ii) u(x) is strictly quasi-concave, 

    (iii) [v, V] is the utility transform of [u, U], 

    (iv) z ∈ U is the optimal point of sup u(x) if it exists. 

Given that x ∈ U, p ∈ V and ⟨p, x⟩ ≤ 0, x ≠ z, then u(x) + v(p) = 0 if and only 
if either 

    (I) λp ∈ ∂^loc u(x), λ > 0 
or 

    (II) (a) p ∈ S = V, or V ∩ {p | ⟨p, z⟩ ≤ 0} if z exists 

         (b) μx ∈ ∂^loc v(p), μ > 0 

         (c) ⟨p, x⟩ = 0 

         (d) u(x) = sup_{y∈U} {u(y) | y = x or y = − x}. 

See the appendix for proof of the theorem. 




3. DUALITY THEORY 

Consider the following quasi-concave program. 

PROGRAM A:    max u(x) 
              subject to x ∈ U ∩ χ 

where u is a closed strictly quasi-concave function defined on the convex set U, and χ is a con-
vex cone. We assume that if z is such that u(z) = sup u(x), then z ∉ χ, together with 
U ∩ χ ≠ ∅ and sup {u(x) | x ∈ U ∩ χ} < ∞. The economic dual to Program A is: 

PROGRAM B:    max v(p) 
              subject to p ∈ V ∩ Π 

where [v, V] is the utility transform of [u, U], and Π is the dual cone to χ defined by 

    Π = {p | ⟨p, x⟩ ≤ 0, ∀ x ∈ χ}. 

At optimality the following relations hold: 

    x ∈ U ∩ χ,  p ∈ V ∩ Π 
    ⟨p, x⟩ = 0 
    u(x) + v(p) = 0 
    λp ∈ ∂^loc u(x), λ > 0 
    μx ∈ ∂^loc v(p), μ > 0 
    u(x) = sup {u(y) | y = x or y = − x}. 

An interpretation of these optimality conditions will be given when the example is discussed in 
the following section. 

THEOREM 2: The previously mentioned optimality conditions are necessary and 
sufficient for optimality for Programs A and B. 

4. EXAMPLE 



The theory of economic planning is concerned with devising an allocation of resources 
which maximizes a given social welfare function of the society in question, given that the 
society has a prescribed quantity of endowments of labour, together with some consumption 
goods remaining from the previous period. The society has available a set of known production 
technologies, which take inputs, such as labour, and transform them into consumer goods. The 
use of some goods as both production inputs and consumer goods is not precluded. Problems 
of this type are discussed by Heal [8]. 

Thus in a directive economy, the central planning office must effectively solve a large 
mathematical programming problem. In order to illustrate the duality concepts developed ear-
lier, consider a simplified version of the planning problem which contains all the essential ele-
ments of economic planning. The problem is defined as follows: 






    max w(x + e) ≜ w(x; e) 
    subject to x ≥ − e 
               x = Az;  z ≥ 0 

where x = (x^c, x^l)′. Here, superscripts denote vectors and subscripts denote scalars. 

w(x; e) is a known quasi-concave social welfare function which captures the preference 
orderings that this society has for its consumption of goods, x^c + e^c, and the use of leisure, 
x^l + e^l. The society has an endowment of leisure, e^l, which it can forego, in quantities x^l, to 
provide labour for productive activities to produce more consumer goods, x^c, which increase 
aggregate social welfare. It should be noted that the nature of the coefficients of the production 
activity matrix, A, will ensure that only positive quantities of consumer goods will be produced, 
and that leisure will be consumed by production, x^l ≤ 0; i.e., only positive quantities of labour 
will be supplied. 

This problem will now be expressed in terms of the format developed in Section 3 in 
order to illustrate the relationship between the primal and dual programs. This will be done for 
social welfare functions which are assumed to be log-linear. The problem can now be expressed 
in a form similar to that referred to as PROGRAM A in the previous section: 

    max w(x; e) = Σ_i a_i log(x_i + e_i) 
    subject to x ∈ U ∩ χ 

where U = {x | x_i > − e_i, ∀ i} 

and χ = {x | x = Az; z ≥ 0}. 

Clearly χ is a cone, which is defined by the technological possibilities available to the economy. 
Thus, this particular formulation of the primal problem treats production through the cone, 
which, as will be shown below, results in a considerable simplification. 

To obtain PROGRAM B, the economic dual, it is necessary to derive the utility transform 
of w(x; e): 

    inf_{x∈U} {− w(x; e) | ⟨p, x⟩ ≤ 0}. 

Forming the Lagrangian, and then differentiating with respect to x_i and λ, the following results 
are obtained: 

    a_i (x_i + e_i)^{−1} − λ p_i = 0,  ∀ i 

    ⟨p, x⟩ = 0. 

From simple algebra, it can now be shown that 

    x_i = − e_i + (a_i / p_i) (Σ_j p_j e_j) / (Σ_j a_j),  ∀ i 

and 

    λ = (Σ_j a_j) / (Σ_j p_j e_j). 

Before continuing, the discussion can be simplified by assuming that the a_i are selected 
so that Σ_i a_i = 1, since all the expressions derived clearly imply that the a_i are normalized. 

Substituting for x_i, the utility transform v(p; e) of w(x; e) is obtained: 

    v(p; e) = Σ_i a_i log [p_i / (a_i Σ_j p_j e_j)] 

    V = {p | p_i ≥ 0, ∀ i}. 

Thus, the economic dual to the original problem can now be written in the form of PRO-
GRAM B given in the previous section: 

    max Σ_i a_i log [p_i / (a_i Σ_j p_j e_j)] 
    subject to p ∈ V ∩ Π 

where V = {p | p_i ≥ 0, ∀ i} 

and Π = {p | A′p ≤ 0} 

where Π is the polar cone of χ and its derivation follows from the well-known properties of 
finite cones. 
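The closed-form solution for the log-linear case is easy to check numerically. The sketch below (ours, with arbitrary illustrative data a, e, p, not from the paper) verifies the demand formula, the budget condition ⟨p, x⟩ = 0, and that the utility inequality w(x; e) + v(p; e) ≤ 0 holds with equality at the optimum.

```python
# Numeric check of the log-linear example (illustrative data, not from the
# paper).  With normalized a_i, the derivation above gives
#   x_i = -e_i + a_i * (sum_j p_j e_j) / p_i
#   v(p; e) = sum_i a_i * log(p_i / (a_i * sum_j p_j e_j))
# and at the optimum <p, x> = 0 and w(x; e) + v(p; e) = 0.

import math

def demand(a, e, p):
    s = sum(pj * ej for pj, ej in zip(p, e))   # value of the endowment
    return [ai * s / pi - ei for ai, pi, ei in zip(a, p, e)]

def welfare(a, e, x):
    return sum(ai * math.log(xi + ei) for ai, xi, ei in zip(a, x, e))

def dual_value(a, e, p):
    s = sum(pj * ej for pj, ej in zip(p, e))
    return sum(ai * math.log(pi / (ai * s)) for ai, pi in zip(a, p))

a = [0.5, 0.3, 0.2]          # normalized utility weights (sum to 1)
e = [1.0, 2.0, 4.0]          # endowments
p = [2.0, 1.0, 0.5]          # an arbitrary positive price vector

x = demand(a, e, p)
budget = sum(pi * xi for pi, xi in zip(p, x))          # should be 0
gap = welfare(a, e, x) + dual_value(a, e, p)           # should be 0
```

With these numbers the budget and the gap w(x; e) + v(p; e) both vanish (to rounding), and the welfare at the computed x exceeds the welfare at the feasible point x = 0, as the optimality conditions require.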

It is of some interest to provide an interpretation of this dual program. The most impor-
tant property that emerges is that the dual program is expressed solely in terms of the dual vari-
ables, p, while retaining all the basic parameters that defined the primal program. It needs little 
appeal to one's intuition to interpret the dual variables, p, as some type of price vector. Unfor-
tunately this, in itself, does not provide much insight, since the dual programs to various prob-
lems, for example, linear programming and posynomial programming, generate dual variables 
with quite different interpretations. 

For dual programs derived by use of the utility transform, the key lies in the requirement 
that at optimality ⟨p, x⟩ = 0. The optimizing process implicit in the utility transform is 
identical to the well-known optimizing problem in economics in which a consumer selects his 
most preferred commodity bundle, subject to the restriction that his expenditure must not 
exceed his income. Thus, it seems plausible to interpret the dual variables, p, as market prices. 
A careful examination of, not only the dual program, but also the relationships between the pri-
mal and dual variables at optimality, indicates that this is a valid and useful interpretation. 

As a consequence of the utility transform [v, V], the dual variables, p, must be non-
negative, which is a requirement for a sensible price system. 

The dual variables, p, are related to the primal variables, x, by the relationship

λpᵢ = ∂w(x; e)/∂xᵢ.



DUALITY FOR QUASI-CONCAVE PROGRAMS 619 

The Lagrange multiplier, λ, from the utility transform appears because, even if the optimal solution to the primal problem, x̄, is known, the magnitudes of the resulting prices depend on the units of measurement used in w(x; e).

This relationship indicates that, in order to have an absolute measure of the dual variables, p, when the solution, x̄, to the primal problem is known, it is necessary to know λ, which arises in the utility transform. By examination of the utility transform, it is clear that the Lagrange multiplier, λ, can be interpreted as the marginal utility of society's endowment, and its magnitude clearly depends on the units in which the social welfare, w, is expressed. In the planning problem considered here, it can easily be seen that λ = 1/Σᵢ pᵢeᵢ, which is the reciprocal of the value of society's endowment. Furthermore, as λ is neither a primal nor a dual variable, it is a linking variable. As indicated earlier, consumer preferences for different commodity bundles can be expressed as an ordinal function, which need only be quasi-concave. If it were accepted that utility could be measured absolutely, and was not merely an ordering concept, then one could assume a concave utility function and then use a duality theory based on the conjugate transform. Similar comments are valid for the social welfare function, w, which is also ordinal.

If commodity i = 1 is designated as the numeraire good, then a set of relative prices can be used to define the optimality condition

pᵢ/p₁ = [∂w(x; e)/∂xᵢ] / [∂w(x; e)/∂x₁].

Thus, at optimality, the familiar result emerges; namely, that the relative price of good i (in terms of good 1) is equal to the marginal rate of substitution of good i for good 1, MRSᵢ₁. A close examination of the dual social welfare function, v(p; e), indicates that it is similar to the indirect utility referred to earlier. The difference here is that the dual social welfare function has the term Σⱼ pⱼeⱼ, which is clearly the market value of society's endowment given a price vector p. This is similar to the notion of income, which is used in conventional consumer theory. As an aside, it should be noted that the indirect utility function as used by Lau [10] in consumer theory can be expressed in the form of v(p; e) if consumer income, Y, is assumed to be the only endowment, with a price of 1, and all other goods are purchased by the sale of the endowment (income).


Keeping this discussion in mind, and accepting the interpretation of the dual variables at optimality as market prices, it is now possible to provide an insight into another optimality condition: μx̄ ∈ ∂^lev v(p̄; e). It can be readily shown that μ = Σᵢ pᵢeᵢ. This optimality condition results in a set of relationships similar to the demand equations found in consumer theory. For the planning problem being analysed here, this condition tells us that if an optimal solution to the dual, p̄, is known, then the quantity of consumer goods that will be produced, and the amount of leisure foregone to produce these goods, are obtained from this differential relationship.

The condition of primal and dual feasibility, together with the requirement that the linking condition <p, x> = 0 holds at optimality, can be used to develop some insights into the production sector.

Primal feasibility: x ∈ U ∩ χ ⟹ x ≥ −e and x = Az, z ≥ 0.

Dual feasibility: p ∈ V ∩ Π ⟹ p ≥ 0 and A′p ≤ 0.



620 T.R. JEFFERSON, G.M. FOLIE & C.H. SCOTT 

The linking condition, <p, x> = 0, is interpreted as a trade balance constraint which ensures that the value of the goods produced is equal to the value of the compensation paid for the inputs used to manufacture these goods.

By direct substitution, <p, x> = <p, Az> = <A′p, z> = 0.

From the primal and dual feasibility conditions, z ≥ 0 and A′p ≤ 0, this means that if <A′p, z> = 0, then each term of this scalar expression must be zero. This is only possible when, for each production activity j:

if Σᵢ aᵢⱼpᵢ < 0, then zⱼ = 0,

or if Σᵢ aᵢⱼpᵢ = 0, then zⱼ ≥ 0.



Thus, by use of the optimality conditions derived in Section 3, the familiar complementary slackness conditions emerge. These can be given the usual economic interpretation. If a particular activity j is included in the optimal plan, then zⱼ > 0 and thus Σᵢ aᵢⱼpᵢ = 0, which means that there are no excess profits and the value of the labour (foregone leisure) used in this particular production activity is equal to the value of the output. On the other hand, if a particular activity is uneconomic, zⱼ = 0, then the value of the labour services used to operate the activity at unit level is greater than the returns from the products produced by the activity. Again, the results that emerge are the standard ones encountered in the economic theory of market behaviour.

Thus it can be seen that the planning problem formulated at the beginning of this section can be viewed in a completely different manner. An alternative to solving the original planning problem, which required the planners to issue the directives, x̄, to everyone in society, would be to solve the dual program to obtain a set of prices, p̄, which can be interpreted as market prices. Then the planners need only announce these prices, p̄, and the members of society, by responding to these price signals, will generate an allocation of resources equivalent to that which would have occurred under a directive, x̄.

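The price-directed reading of the solution can be made concrete with a toy technology (all numbers below are invented for illustration; none come from the paper). Given announced prices p, each activity's per-unit profit Σᵢ aᵢⱼpᵢ is computed; activities that lose money are left idle, break-even activities may run, and the resulting net output vector x = Az balances trade, <p, x> = 0.

```python
def profits(A, p):
    """Per-unit profit of each activity j at prices p: sum_i a_ij * p_i."""
    return [sum(A[i][j] * p[i] for i in range(len(A))) for j in range(len(A[0]))]

def complementary_slackness(A, p, z, tol=1e-9):
    """Each activity either breaks even (zero profit) or is operated at zero level."""
    return all(z[j] < tol or abs(pr) < tol for j, pr in enumerate(profits(A, p)))

# Illustrative technology (not from the paper): rows = commodities (a good, leisure),
# columns = activities; negative entries are inputs (foregone leisure).
A = [[1.0, 1.0],
     [-1.0, -2.0]]
p = [1.0, 1.0]   # announced prices
z = [3.0, 0.0]   # activity 1 breaks even and runs; activity 2 loses 1 per unit and is idle

x = [sum(A[i][j] * z[j] for j in range(2)) for i in range(2)]   # net outputs x = Az
trade_balance = sum(p[i] * x[i] for i in range(2))              # <p, x>

print(profits(A, p))                     # [0.0, -1.0]
print(complementary_slackness(A, p, z))  # True
print(trade_balance)                     # 0.0
```

Decentralized responses to the announced prices thus reproduce exactly the complementary slackness structure derived above.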
ACKNOWLEDGMENT 

The authors wish to thank the referee for his helpful comments and for pointing out the important paper [7].

BIBLIOGRAPHY 

[1] Arrow, K.J. and A.C. Enthoven, "Quasi-Concave Programming," Econometrica, 29, 779-800 (1961).
[2] Crouzeix, J.P., "Contributions à l'étude des fonctions quasi-convexes," (Ph.D. Dissertation, University of Clermont, France, 1977).
[3] Diewert, W.E., "Applications of Duality Theory," in Frontiers of Quantitative Economics, Vol. II, M.D. Intriligator and D.A. Kendrick, Editors (American Elsevier Publishing Co., New York, 1974).
[4] Green, H.A.J., Consumer Theory (Macmillan, London, 1976).
[5] Greenberg, H.J. and W.P. Pierskalla, "Surrogate Mathematical Programming," Operations Research, 18, 924-939 (1970).




[6] Greenberg, H.J. and W.P. Pierskalla, "A Review of Quasi-Convex Functions," Operations Research, 19, 1553-1570 (1971).
[7] Greenberg, H.J. and W.P. Pierskalla, "Quasi-Conjugate Functions and Surrogate Duality," Cahiers du Centre d'Etudes de Recherche Opérationnelle, 15, 437-448 (1973).
[8] Heal, G.M., The Theory of Economic Planning (American Elsevier, New York, 1973).
[9] Jefferson, T.R., G.M. Folie and C.H. Scott, "Dual Games," School of Mechanical and Industrial Engineering Report (1977).
[10] Lau, L.J., "Duality and the Structure of Utility Functions," Journal of Economic Theory, 1, 374-396 (1969).
[11] Luenberger, D.G., "Quasi-Convex Programming," SIAM Journal on Applied Mathematics, 16, 1090-1095 (1968).
[12] Peterson, E.L., "Geometric Programming," SIAM Review, 18, 1-52 (1976).
[13] Rockafellar, R.T., Convex Analysis (Princeton University Press, Princeton, New Jersey, 1970).
[14] Roy, R., "La distribution du revenu entre les divers biens," Econometrica, 15, 205-225 (1947).

APPENDIX



PROOF OF LEMMA 1: Consider any two points p₁ and p₂ and 0 < λ < 1.

v(λp₁ + (1 − λ)p₂) = inf_{x∈U} {−u(x) | <λp₁ + (1 − λ)p₂, x> ≤ 0}

≥ min [ inf_{x₁∈U} {−u(x₁) | <p₁, x₁> ≤ 0}, inf_{x₂∈U} {−u(x₂) | <p₂, x₂> ≤ 0} ]

= min [v(p₁), v(p₂)].

This proves the quasi-concavity of v(p).

Consider λ > 0; then

v(λp) = inf_{x∈U} {−u(x) | <λp, x> ≤ 0} = inf_{x∈U} {−u(x) | <p, x> ≤ 0} = v(p).

Thus v(p) is positively homogeneous of degree zero. V is a convex cone by construction.

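This degree-zero homogeneity has a concrete reading in the planning example: scaling all prices by λ > 0 scales the value of the endowment by λ as well, and the two effects cancel in the indirect welfare function. A numerical sketch (data invented for illustration):

```python
import math

def v(a, p, e):
    # indirect welfare: v(p; e) = sum_i a_i * log(a_i * m / p_i), with m = <p, e>
    m = sum(pj * ej for pj, ej in zip(p, e))
    return sum(ai * math.log(ai * m / pi) for ai, pi in zip(a, p))

a, p, e = [0.4, 0.6], [1.5, 2.5], [2.0, 1.0]
for lam in (0.1, 1.0, 7.3):
    scaled = [lam * pi for pi in p]
    assert abs(v(a, scaled, e) - v(a, p, e)) < 1e-9
print("v(lambda * p; e) = v(p; e): positively homogeneous of degree zero")
```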
Let x ∈ ∂^lev v(γp), γ > 0. By definition,

(3) lim_{α→0} [v(γp + αΔp) − v(γp)]/α ≤ <Δp, x>.

Since v(γp) is positively homogeneous of degree zero, (3) implies

(4) lim_{α→0} [v(p + (α/γ)Δp) − v(p)]/α ≤ <Δp, x>.







Substituting Δq = Δp/γ in (3), we obtain

(5) lim_{α→0} [v(p + αΔq) − v(p)]/α ≤ <Δq, γx>.

Thus by (5) we have proven that x ∈ ∂^lev v(γp) if and only if γx ∈ ∂^lev v(p). Similar properties are proved for the surrogate dual in [5].

PROOF OF LEMMA 2: By definition, we have that

hypo v = {(p, β) | p ∈ V, β ≤ v(p)}.

Let {(pᵢ, βᵢ)} be a convergent sequence with

lim (pᵢ, βᵢ) = (p̄, β̄) and (pᵢ, βᵢ) ∈ hypo v for all i.

We require to show that (p̄, β̄) is a member of hypo v. Assume that (p̄, β̄) ∉ hypo v. There are two possibilities to consider: (i) p̄ ∉ V and (ii) β̄ > v(p̄). For case (i), lim βᵢ = −∞ by definition. This contradicts the assumption that {(pᵢ, βᵢ)} is convergent. Hence p̄ ∈ V. In case (ii), we let (xᵢ, αᵢ) be such that

v(pᵢ) = −u(xᵢ) = −αᵢ, βᵢ ≤ −αᵢ, ∀i.

This is admissible by the definition of V. We let

lim (xᵢ, αᵢ) = (x̄, ᾱ),

where {(xᵢ, αᵢ)} and (x̄, ᾱ) belong to hypo u, the latter since hypo u is closed. Hence ᾱ = u(x̄) and v(p̄) = −u(x̄). This in turn implies that β̄ ≤ v(p̄), which is a contradiction.

Hence hypo v is closed. A similar lemma is proved in [7].

PROOF OF LEMMA 3: By the utility transform we have

(6) v(p) = inf_{x∈U} {−u(x) | <p, x> ≤ 0}.

The solution to (6) is the solution to the saddlepoint problem

v(p) = inf_x sup_λ {λ<p, x> − u(x)}.

The first order conditions require that

λp ∈ ∂^lev u(x), λ ≥ 0.

Consider now the transform of v(p),

inf_p sup_μ {μ<p, x> − v(p)}.

The first order conditions require:

μx ∈ ∂^lev v(p), μ ≥ 0.




PROOF OF LEMMA 4: Suppose that there exists z ∈ U such that

u(z) = sup_{x∈U} {u(x)} is defined, and

v(p) = inf_{x∈U} {−u(x) | <p, x> ≤ 0}.

Thus v(p) = −u(z) for <p, z> ≤ 0.

Consider p₁, p₂ ∈ V ∩ {p | <p, z> > 0} such that p₁ ≠ p₂ and p₁ ≠ −p₂.

Choose 0 < λ < 1.

v(λp₁ + (1 − λ)p₂) = inf {−u(x) | <λp₁ + (1 − λ)p₂, x> ≤ 0}

> min [ inf {−u(x₁) | <p₁, x₁> ≤ 0}, inf {−u(x₂) | <p₂, x₂> ≤ 0} ]

= min [v(p₁), v(p₂)] by the strict quasi-concavity of u.

Suppose sup_{x∈U} {u(x)} is undefined. Consider p₁, p₂ ∈ V such that p₁ ≠ p₂. Choose 0 < λ < 1.

v(λp₁ + (1 − λ)p₂) = inf {−u(x) | <λp₁ + (1 − λ)p₂, x> ≤ 0} > min [v(p₁), v(p₂)] by the strict quasi-concavity of u.

PROOF OF THEOREM 1: First assume

(7) u(x̄) + v(p̄) = 0.

(7) implies λp̄ ∈ ∂^lev u(x̄) and μx̄ ∈ ∂^lev v(p̄) by Lemma 3. λ and μ are positive because u(x̄), v(p̄) are strictly quasi-concave.

Suppose p̄ satisfies <p̄, z> < 0. Since x̄ ≠ z,

(8) {p | <p, x̄> ≤ 0, <p, z> > 0} ≠ ∅.

v(p) is strictly quasi-concave on this set. For p belonging to the set defined by (8), v(p) > v(p̄). This contradicts the utility inequality. Therefore (a) must hold.

Suppose <p̄, x̄> < 0; then by the strict quasi-concavity of u(x) there exists an x such that <p̄, x> = 0 and u(x) > u(x̄). This contradicts the utility inequality. Therefore (c) must hold.

Suppose u(x̄) < sup {u(y) | y = x̄ or y = −x̄}. This too would contradict the utility inequality. Thus (d) must hold.




Now, going the other way, suppose we have λp̄ ∈ ∂^lev u(x̄).

Consider

(9) v(p̄) = inf_{x∈U} {−u(x) | <p̄, x> ≤ 0}.

By property (i), if the infimum in (9) exists it is attained on U ∩ {x | <p̄, x> ≤ 0}. Because u(x) is strictly quasi-concave, a local minimum is a global minimum. Therefore

v(p̄) = −u(x̄).

Suppose now (IIa-d) hold. v(p) is strictly quasi-concave in the sense of Lemma 4 on S. Therefore (b) and Lemma 2 imply

−v(p̄) = inf {−v(p) | <p, x̄> ≤ 0}.

Let x̄ satisfy (d) in addition. Let λp̄ ∈ ∂^lev u(x̄), λ > 0. ∂^lev u(x̄) is non-empty because of property (i).

Consider

(10) inf {−u(x) | <p̄, x> ≤ 0}.

The infimum of (10) is attained at x̄ by construction. Therefore v(p̄) = −u(x̄) and <p̄, x̄> ≤ 0. By construction −v(p̄) < −v(p) if p ≠ p̄; this would imply u(x̄) + v(p) > 0, which contradicts the utility inequality. Therefore v(p̄) = −u(x̄) and

u(x̄) + v(p̄) = 0.

Note that if x̄ = z, then any p satisfying <p, x̄> ≤ 0 with p ∈ V will satisfy the utility inequality. λ and μ are equal to zero and we lose the relationships I and II between the primal and dual variables. The primal problem reduces to one of global minimization, which is relatively straightforward.

PROOF OF THEOREM 2: Assume x̄ is an optimum for Program A. Thus there exists a vector p̄ such that λp̄ ∈ ∂^lev u(x̄), λ > 0, and <λp̄, Δx> ≤ 0 for Δx ∈ χ. Since x̄ ∈ χ, <λp̄, x̄> ≤ 0. Thus p̄ ∈ Π ∩ V by construction and Theorem 1. Also by Theorem 1,

u(x̄) + v(p̄) = 0.

Also by Theorem 1 the remaining conditions hold, since <p̄, Δx> ≤ 0 ∀ Δx ∈ χ. Now suppose the optimum for Program B is p* ∈ V ∩ Π such that v(p*) > v(p̄). This would either contradict u(x̄) + v(p*) ≤ 0, which is impossible, or imply <x̄, p*> > 0. This too is impossible, since x̄ ∈ χ and p* ∈ Π. Therefore p̄ is optimal for Program B.

Suppose now we have p̄, an optimum for Program B. Thus there exists a vector x̄ such that μx̄ ∈ ∂^lev v(p̄), μ > 0, u(x̄) = sup {u(y) | y = x̄ or y = −x̄}, and <x̄, Δp> ≤ 0 for Δp ∈ Π. Since p̄ ∈ Π, <x̄, p̄> ≤ 0. By construction and Theorem 1, x̄ ∈ χ ∩ U. Also by Theorem 1,







u(x̄) + v(p̄) = 0,

and the remaining optimality conditions hold.

Suppose the optimum for Program A is x* ∈ U ∩ χ such that u(x*) > u(x̄). This would either contradict u(x*) + v(p̄) ≤ 0, which is impossible, or imply <x*, p̄> > 0, which contradicts the feasibility of x* and p̄. The result is proved.



ON THE EXISTENCE OF JOINT PRODUCTION FUNCTIONS* 

Rokaya Al-Ayat 

Lawrence Livermore Laboratory 
Livermore, California 

Rolf Fare 

Department of Economics 

Southern Illinois University 

Carbondale, Illinois 

ABSTRACT 

Within a general framework of production correspondences satisfying a set of weak axioms, necessary and sufficient conditions for the existence of a joint production function are given. Without enforcing the strong disposability of inputs or outputs, it is shown that a joint production function exists if and only if both input and output correspondences are strictly increasing along rays.

Joint production functions are frequently used in economics; however, it was not until Shephard in [6] defined such a notion within the general framework of production correspondences that its meaning became clear. The question of the existence of these functions, dealt with in this paper, is yet to be settled. On this issue Shephard [8] wrote, "The joint production function is a tricky concept, seemingly simple but not shown to exist except under very restrictive conditions."

For a production technology with strongly disposable inputs and outputs, Bol and Moeschlin [2] showed that continuity of both the input and the output correspondences, together with essentiality of all inputs, is sufficient for the existence of a joint production function. Later, Bol in [1] showed that such a function would also exist if the essentiality condition is replaced by strict increasancy of the output correspondence in all inputs.

It is to be recalled that an output correspondence x → P(x) ∈ 2^{R₊ᵐ} is a mapping from input vectors x ∈ R₊ⁿ into subsets P(x) ∈ 2^{R₊ᵐ} of all output vectors obtainable by x. Inversely to P(x), the input correspondence u → L(u) := {x | u ∈ P(x)} is the set of all input vectors x yielding at least an output vector u. In this paper the existence of a joint production function will be considered under the weak axioms as stated in [7]. Specifically, neither the strong disposability of inputs or outputs (i.e., x′ ≥ x ∈ L(u) ⟹ x′ ∈ L(u), and u′ ≤ u ∈ P(x) ⟹ u′ ∈ P(x), respectively) nor convexity of P(x) or L(u) is enforced. Having strong disposability of inputs means that if a subvector of inputs is kept constant while the remaining are increased,


*This research has been partially supported by the Office of Naval Research under Contract N00014-76-C-0134 with the University of California. Reproduction in whole or in part is permitted for any purpose of the United States Government.








output will never decrease, implying there can be no congestion in the production system. In addition, strong disposability of outputs excludes their null jointness (see [9]); i.e., each output must be producible when others are not produced. Thus, having only weak disposability of inputs (i.e., P(λ·x) ⊇ P(x), λ ≥ 1) and outputs (i.e., L(θ·u) ⊆ L(u), θ ≥ 1) allows modelling of both congestion and null jointness.

As defined by Shephard [6], the joint production function relates input and output isoquants to each other. Recall that

ISOQ P(x) := {u | u ∈ P(x), θ·u ∉ P(x), θ > 1}, P(x) ≠ {0},

and

ISOQ L(u) := {x | x ∈ L(u), λ·x ∉ L(u), λ < 1}, L(u) ≠ ∅.

DEFINITION: The function F : R₊ᵐ × R₊ⁿ → R₊ such that

(1) for u⁰ ≥ 0, ISOQ L(u⁰) = {x | F(u⁰, x) = 0}, L(u⁰) ≠ ∅, and

(2) for x⁰ ≥ 0, ISOQ P(x⁰) = {u | F(u, x⁰) = 0}, P(x⁰) ≠ {0},

is a joint production function.

An equivalent statement of the definition, to be used in the sequel, was proved by Bol and Moeschlin [2], namely:

LEMMA: A joint production function F(u, x) exists if and only if, for all x ≥ 0 with P(x) ≠ {0} and u ≥ 0 with L(u) ≠ ∅, u ∈ ISOQ P(x) ⟺ x ∈ ISOQ L(u).

THEOREM: For all x ≥ 0, u ≥ 0 such that P(x) ≠ {0}, L(u) ≠ ∅, with x → P(x) (u → L(u)) satisfying the weak axioms, a necessary and sufficient condition for the existence of a joint production function F(u, x) is

(*) ISOQ P(x) ∩ ISOQ P(λ·x) = ISOQ L(u) ∩ ISOQ L(θ·u) = ∅

for all positive scalars λ, θ ≠ 1.

PROOF: To show the necessity of (*), assume there is a joint production function F(u, x) and let u ∈ ISOQ P(x) ∩ ISOQ P(λ·x). By the lemma, x ∈ ISOQ L(u) and λ·x ∈ ISOQ L(u), λ ≠ 1, which is a contradiction. Thus, if a joint production function exists, ISOQ P(x) ∩ ISOQ P(λ·x) is empty for all positive scalars λ, λ ≠ 1. A similar argument can be used to show that the existence of F(u, x) implies that for all positive θ, θ ≠ 1, ISOQ L(u) ∩ ISOQ L(θ·u) is empty.

To show the sufficiency, assume that (*) holds, and that for x ≥ 0, P(x) ≠ {0}, u ∈ ISOQ P(x) but x ∉ ISOQ L(u). From the definition of the isoquant, there exists a λ < 1 such that λ·x ∈ ISOQ L(u), implying that u ∈ P(λ·x). But from the weak disposability of inputs, P(λ·x) ⊆ P(x), which together with (*) implies that u ∉ ISOQ P(x), a contradiction. Similarly, it can be shown that having ISOQ L(u) ∩ ISOQ L(θ·u) empty guarantees that x ∈ ISOQ L(u) ⟹ u ∈ ISOQ P(x). Hence the sufficiency of (*) for the existence of a joint production function is proved (see the lemma). Q.E.D.







Continuity of the production correspondences has not been enforced. However, following an argument similar to that used by Bol and Moeschlin in [2], one can prove:

COROLLARY: If a joint production function exists, then both the input and the output correspondences are continuous along rays, i.e., P(λ⁰·x) = ∪_{0<λ<λ⁰} P(λ·x) and L(θ⁰·u) = ∪_{θ>θ⁰} L(θ·u), respectively, with u, x ≠ 0.

Note that continuity along rays together with strong disposability imply continuity (see [2] for the definition).

Next, consider the production technology

P(x₁, x₂) := {{(u₁, 0)} ∪ {(0, u₂)} | 0 ≤ uᵢ ≤ xᵢ, i = 1, 2}

and, inversely,

L(u₁, u₂) := {{(x₁, 0)} ∪ {(0, x₂)} | xᵢ ≥ uᵢ, i = 1, 2}.

The corresponding isoquants are given by

ISOQ L(u₁, u₂) = {{(x₁, 0)} ∪ {(0, x₂)} | xᵢ = uᵢ, i = 1, 2}

and

ISOQ P(x₁, x₂) = {{(u₁, 0)} ∪ {(0, u₂)} | uᵢ = xᵢ, i = 1, 2}.

In this example, the production correspondence satisfies the weak axioms, but neither strong disposability of inputs and outputs nor the essentiality condition (i.e., P(x) ≠ {0} implies (x₁, x₂) > (0, 0)) used in [2] holds. Yet it is clear that a joint production function exists.

Finally, an example not satisfying the sufficiency conditions applied in [1] and [2] is given. Before introducing it, the following proposition, to be used below, is proved.

PROPOSITION: If the production function φ(x) := max {u | x ∈ L(u)} is continuous and strictly increasing along rays in the input space R₊ⁿ, then ISOQ L(u) = {x | φ(x) = u}, u > 0.

PROOF: Clearly ISOQ L(u) ⊆ {x | φ(x) ≥ u}, u > 0; let x⁰ ∈ {x | φ(x) > u}. Since φ is continuous along rays, {λ | φ(λ·x⁰) > u} is open, implying that x⁰ ∉ ISOQ L(u); hence ISOQ L(u) ⊆ {x | φ(x) = u}. Next, assume x⁰ ∉ ISOQ L(u), u > 0; then, since φ is strictly increasing along rays, if x⁰ ∈ L(u) there is a λ < 1 such that φ(λ·x⁰) = u, implying that x⁰ ∉ {x | φ(x) = u}. Q.E.D.

Now, consider the output correspondence x → P(x) ⊆ [0, +∞),

P(x) := {u ∈ R₊ : u ≤ φ(x)},

φ(x) := A·[(1 − δ)(x₁ − γx₂)^(−ρ) + δx₂^(−ρ)]^(−1/ρ) for (x₁ − γx₂) > 0, and φ(x) := 0 otherwise,

where the parameters of the WDI production function φ(x) are A > 0, δ ∈ (0, 1), γ ∈ (0, ∞) and ρ ∈ (−1, 0) (see [3]). For these values of the parameters, φ(x) is upper semicontinuous, which is equivalent to P(x) being upper hemi-continuous (see [5], p. 22); also,







x₂ = 0 does not imply P(x) = {0}, and φ is not increasing in x₂. Thus P(x) does not meet the continuity requirement of [1] and [2], nor does it meet the other sufficiency condition of [2] (essentiality of all factors) or [1] (strict increasancy in all factors).

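The failure of monotonicity in x₂ is easy to exhibit numerically. The sketch below evaluates φ at one arbitrary parameter choice (A = 1, δ = 0.5, γ = 1, ρ = −0.5, picked only for illustration, subject to the restrictions above): with x₁ fixed, output first rises and then falls as x₂ grows, the congestion that the weak axioms are meant to accommodate.

```python
def phi(x1, x2, A=1.0, delta=0.5, gamma=1.0, rho=-0.5):
    """WDI function: A*[(1-d)(x1-g*x2)^(-r) + d*x2^(-r)]^(-1/r) where x1 - g*x2 > 0."""
    if x1 - gamma * x2 <= 0:
        return 0.0
    inner = (1 - delta) * (x1 - gamma * x2) ** (-rho) + delta * x2 ** (-rho)
    return A * inner ** (-1.0 / rho)

outputs = [phi(2.0, x2) for x2 in (0.2, 1.0, 1.9)]
print(outputs[0] < outputs[1])  # more x2 first raises output ...
print(outputs[1] > outputs[2])  # ... then congests it: phi is not increasing in x2
```

Both comparisons print True, so φ is neither essential in x₂ nor monotone in it, exactly the failures of the sufficiency conditions claimed in the text.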
Using the proposition above, the isoquants of P(x) and L(u) are easily computed to be

ISOQ P(x) = {u | u = φ(x)} and ISOQ L(u) = {x | φ(x) = u}.

Thus, x ∈ ISOQ L(u) ⟺ u ∈ ISOQ P(x), showing that under the weak axioms for a production technology the sufficient conditions found in [1] and [2] need not hold for a joint production function to exist.

ACKNOWLEDGMENT 

The authors sincerely thank Professor Ronald W. Shephard for his suggestions and helpful comments.

BIBLIOGRAPHY 

[1] Bol, G., "Produktionskorrespondenzen und Existenz skalarwertiger Produktionsfunktionen bei der Mehrgüterproduktion," Karlsruhe (1976).
[2] Bol, G. and O. Moeschlin, "Isoquants of Continuous Production Correspondences," Naval Research Logistics Quarterly, Vol. 22, pp. 391-398 (1975).
[3] Fare, R. and L. Jansson, "On VES and WDI Production Functions," International Economic Review, Vol. 16, pp. 745-750 (1975).
[4] Fare, R. and L. Jansson, "Joint Inputs and the Law of Diminishing Returns," Zeitschrift für Nationalökonomie, Vol. 36, pp. 407-416 (1976).
[5] Hildenbrand, W., Core and Equilibria of a Large Economy (Princeton University Press, 1974).
[6] Shephard, R.W., Theory of Cost and Production Functions (Princeton University Press, 1970).
[7] Shephard, R.W., "Semi-Homogeneous Production Functions," Lecture Notes in Economics and Mathematical Systems, Vol. 99, Production Theory (Springer-Verlag, Berlin, 1974).
[8] Shephard, R.W., "On Household Production Theory," ORC 76-24, Operations Research Center, University of California, Berkeley (1976).
[9] Shephard, R.W. and R. Fare, "The Law of Diminishing Returns," Zeitschrift für Nationalökonomie, Vol. 34, pp. 69-90 (1974).



THE PURE FIXED CHARGE TRANSPORTATION PROBLEM 

John Fisk 

School of Business 

State University of New York at Albany 

Albany, New York 

Patrick McKeown 

College of Business Administration 

University of Georgia 

Athens, Georgia 

ABSTRACT 

The pure fixed charge transportation problem (PFCTP) is a variation of the fixed charge transportation problem (FCTP) in which only fixed costs are incurred when a route is opened. We present in this paper a direct search procedure using the LIFO decision rule for branching. This procedure is enhanced by the use of 0-1 knapsack problems which determine bounds on partial solutions. Computational results are presented and discussed.

INTRODUCTION 

The pure fixed charge transportation problem (PFCTP) deals with the optimal allocation of supply Sᵢ available at source i = 1, 2, …, m in order to meet demand Dⱼ at destination j = 1, 2, …, n. Before any goods can be shipped from i to j, a fixed charge, fᵢⱼ, must be paid. The objective is then to minimize the total cost of shipping available goods to meet the required demands.

One would wish to solve the pure fixed charge transportation problem in those situations where the cost to transport goods over an arc must be paid as a lump sum rather than as per unit costs. An example would be leasing trucks to move goods between supply points and demand points. So long as the amount demanded is less than the capacity of the truck for that route, the cost of moving any amount of goods greater than zero is approximately the same; i.e., the leasing cost, fuel, and the driver's salary are fixed. In this case, we would wish to determine the set of routes which would allow us to satisfy total demand at the minimum possible sum of these lump or fixed costs.



Mathematically, the problem may be formulated as follows:

(1) Minimize Z = Σᵢ₌₁ᵐ Σⱼ₌₁ⁿ fᵢⱼyᵢⱼ

(2) subject to Σⱼ₌₁ⁿ xᵢⱼ = Sᵢ, i = 1, 2, …, m,






(3) Σᵢ₌₁ᵐ xᵢⱼ = Dⱼ, j = 1, 2, …, n,

(4) xᵢⱼ ≥ 0, ∀ i, j,

(5) yᵢⱼ = 1 if xᵢⱼ > 0, and yᵢⱼ = 0 otherwise.



We are assuming that Σᵢ₌₁ᵐ Sᵢ = Σⱼ₌₁ⁿ Dⱼ and that the fᵢⱼ's are integer.

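The model (1)-(5) can be made concrete on a tiny invented instance: because only the fixed charges enter the objective, an optimal solution is a cheapest set of open routes through which all supply can feasibly be shipped. The brute-force sketch below (not the search procedure of this paper; the instance data are made up) enumerates route subsets and checks feasibility with a small max-flow routine.

```python
from itertools import product
from collections import deque

def feasible(open_routes, S, D):
    """Can all supply be shipped using only the open (i, j) routes? Max-flow check."""
    m, n = len(S), len(D)
    sink = m + n + 1  # nodes: 0 = source, 1..m = supplies, m+1..m+n = demands
    cap = {}
    for i in range(m):
        cap[(0, 1 + i)] = S[i]
    for (i, j) in open_routes:
        cap[(1 + i, 1 + m + j)] = sum(S)  # open arcs are effectively uncapacitated
    for j in range(n):
        cap[(1 + m + j, sink)] = D[j]
    flow = 0
    while True:
        parent = {0: None}                # BFS for an augmenting path
        q = deque([0])
        while q and sink not in parent:
            u = q.popleft()
            for (a, b), c in cap.items():
                if a == u and c > 0 and b not in parent:
                    parent[b] = u
                    q.append(b)
        if sink not in parent:
            return flow == sum(S)
        path, v = [], sink                # recover the path and augment along it
        while parent[v] is not None:
            path.append((parent[v], v)); v = parent[v]
        push = min(cap[e] for e in path)
        for e in path:
            cap[e] -= push
            cap[(e[1], e[0])] = cap.get((e[1], e[0]), 0) + push
        flow += push

def solve_pfctp(S, D, f):
    """Brute force: cheapest set of open routes admitting a feasible shipment."""
    routes = [(i, j) for i in range(len(S)) for j in range(len(D))]
    best = None
    for mask in product([0, 1], repeat=len(routes)):
        open_routes = [r for r, y in zip(routes, mask) if y]
        cost = sum(f[i][j] for (i, j) in open_routes)
        if (best is None or cost < best) and feasible(open_routes, S, D):
            best = cost
    return best

S, D = [5, 5], [5, 5]
f = [[1, 8], [9, 2]]
print(solve_pfctp(S, D, f))  # 3: open routes (0, 0) and (1, 1)
```

This exponential enumeration is only a reference baseline; the direct search procedure developed in Section III is designed to avoid exactly this blowup.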
In this paper, we will present a direct search procedure for solving the PFCTP which utilizes the unique structure of the constraints (2) and (3) to derive bounds that lead to an efficient search over the possible solutions. In Section II, we will discuss solution procedures for a closely related problem, the fixed charge transportation problem, as they relate to the PFCTP, and suggest a procedure for solving the PFCTP. Section III will discuss the development of bounds and present the iterative procedure. Computational results are presented and discussed in Section IV.



II. SOLUTION PROCEDURES FOR THE FIXED CHARGE 
TRANSPORTATION PROBLEM (FCTP) 

Discussion of the PFCTP in the literature is limited. Numerous techniques for solving the closely related fixed charge transportation problem (FCTP) are presently available, however. The FCTP can be specified mathematically as follows:

(6) Minimize Z = Σᵢ₌₁ᵐ Σⱼ₌₁ⁿ (cᵢⱼxᵢⱼ + fᵢⱼyᵢⱼ)

subject to (2), (3), (4), and (5).

We also assume that Σᵢ₌₁ᵐ Sᵢ = Σⱼ₌₁ⁿ Dⱼ and that the fixed costs, the fᵢⱼ's, are integer. In this case, we also have the usual per unit transportation costs, cᵢⱼ.

Research into solving the FCTP may be classified as either heuristic or exact (algorithmic). We will be interested in the latter. Some exact approaches are those of Murty [15], Gray [6], Kennington and Unger [9, 10], McKeown [14], Kennington [10, 11], Barr [2], Frank [5], and Steinberg [17]. With the exception of extreme point ranking procedures, such as those presented by Murty and by McKeown, any of the above procedures could be applied to solving the PFCTP. Most of these would not be expected to be efficient, either because they are designed to solve more general types of fixed charge problems and do not take full advantage of the special constraints (2) and (3), or because they have been shown to be efficient only when variable costs dominate fixed costs. Examples of heuristic methods are [1] and [13].

Two procedures which would appear to be useful in solving the PFCTP, however, are those of Gray and Kennington. In Gray's procedure, a series of tests is developed which enables him to decrease the number of vertices for which he must find the corresponding feasible solution in order to find a satisfactory assignment of routes given specific values of the logic variables yᵢⱼ. Kennington introduces a branch-and-bound procedure for solving the FCTP which employs a relaxation corresponding to the following Hitchcock transportation problem (TP):

(7) Minimize Z = Σᵢ₌₁ᵐ Σⱼ₌₁ⁿ [(fᵢⱼ/uᵢⱼ) + cᵢⱼ]xᵢⱼ

subject to (2), (3), and (4),

where uᵢⱼ = min(Sᵢ, Dⱼ). The formulation for TP above was introduced by Balinski [1] in an approximate procedure for solving the FCTP. Solving TP at each node of his branch-and-bound tree, Kennington is able to calculate effective bounds and to determine simple penalties and feasibilities useful in directing the search procedure. His methodology could be applied to solving the PFCTP by simply setting cᵢⱼ = 0 for all i, j in (7).

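The relaxation (7) needs only a linearized cost matrix, which is cheap to construct. A minimal sketch (instance data invented for illustration), with the cᵢⱼ = 0 specialization suggested for the PFCTP:

```python
def balinski_costs(S, D, f, c=None):
    """Coefficients (f_ij / u_ij) + c_ij of the relaxed problem, u_ij = min(S_i, D_j)."""
    m, n = len(S), len(D)
    if c is None:                        # pure fixed charge case: all c_ij = 0
        c = [[0.0] * n for _ in range(m)]
    return [[f[i][j] / min(S[i], D[j]) + c[i][j] for j in range(n)] for i in range(m)]

S, D = [10, 20], [15, 15]
f = [[30, 60], [45, 30]]
print(balinski_costs(S, D, f))  # [[3.0, 6.0], [3.0, 2.0]]
```

Dividing each fixed charge by the largest amount that could flow on its arc spreads the charge over the flow, turning the fixed cost into an ordinary linear one for the relaxation.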


III. DIRECT SEARCH PROCEDURE FOR SOLVING THE PFCTP 

Using the terminology of Geoffrion and Marsten [5], we solve the PFCTP by separating its set of feasible solutions into subproblems called candidate problems (CP) by assigning values to a subset of the variables yᵢⱼ. A particular (CP) is fully defined by specifying the elements contained in each of two sets, J₀ and J₁, which represent the sets of transportation routes assigned "closed" (i.e., yᵢⱼ = 0) and "open" (i.e., yᵢⱼ = 1), respectively. The remaining elements reside in the set J₂ and are referred to as "free." The sets J₀, J₁, and J₂ are mutually exclusive and collectively exhaustive.

Our enumerative scheme is similar in most respects to a more traditional branch-and-bound scheme employing a direct search (single branch) strategy. In constructing our branching tree, we proceed through successive levels of the tree by choosing a route from the set J₂ and assigning it to J₁. We assign this route to J₀ in the branching tree only upon backtracking. A strict LIFO (last-in, first-out) backtracking rule is observed.

Three factors critical to the computational efficiency of the above approach are (1) the effort required to solve for the bound associated with a given candidate problem, and the quality of the bound produced, (2) the specification of rules useful in identifying (CP)'s which cannot be optimal, and (3) the choice of a separation variable from among those in J₂. Sections IIIa and IIIb detail two bounding procedures, the row feasibility test and the column feasibility test. These tests are easily applied and, when used in conjunction with one another, can yield efficient bounds. Section IIIc specifies a test attributable to Hirsch and Dantzig [8] which serves to limit the number of routes assignable to set J₁. Section IIId outlines the rules used for selecting a separation variable from J₂. In the discussion that follows, we introduce a set J_S = J₀ ∪ J₁, which we refer to as a partial solution to the PFCTP.






IIIa. Row Feasibility Test

We define the cost of row feasibility for row i, RFᵢ, in terms of the least cost set of demanders necessary to absorb the supply Sᵢ given the partial solution J_S. We further define ADᵢ as the total demand assigned to row i given the partial solution J_S, where

(8) ADᵢ = Σⱼ Dⱼyᵢⱼ

and all free variables in row i are assumed closed. If ADᵢ ≥ Sᵢ, then RFᵢ is simply the total cost of the open routes in row i, i.e.,

(9) RFᵢ = Σⱼ fᵢⱼyᵢⱼ.




If ADᵢ < Sᵢ, however, we must determine the minimum additional cost which must be incurred in order to satisfy a necessary condition for row feasibility. To do so, we solve the following 0-1 knapsack problem relative to the set of unassigned routes from supply i:

(10) Minimize nᵢ = Σ_{j∈J₂(i)} fᵢⱼuᵢⱼ

(11) subject to Σ_{j∈J₂(i)} Dⱼuᵢⱼ ≥ dᵢ

(12) uᵢⱼ = 0, 1, ∀ j ∈ J₂(i),

where dᵢ = Sᵢ − ADᵢ and J₂(i) = the set of unassigned routes from supply i. Then, given that assigned demand is less than the available supply for row i, the minimum cost necessary to obtain row feasibility becomes

(13) RFᵢ = Σⱼ fᵢⱼyᵢⱼ + nᵢ.

The applicability of the knapsack relation for determining RF, is based upon the ability to i 
solve problems such as (10) - (12) with minimal effort [3], [7], [18]. Such relations can yield M 
efficient bounds and have been successfully applied in solving the generalized assignment prob- 
lem [16] and in solving warehouse location problems [12]. 
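As a concrete sketch of how such a covering knapsack might be solved, the following Python routine (the function name and the dynamic-programming formulation are our own illustration, not the authors' code) computes the minimum fixed cost needed to absorb a residual demand d_i from a set of unassigned routes, in the spirit of relations (10)-(12):

```python
def min_cost_cover(costs, demands, d):
    """0-1 'covering' knapsack of relations (10)-(12): pick a cheapest subset
    of unassigned routes whose demands absorb at least d units of supply.
    Returns (minimum cost, tuple of chosen route indices)."""
    INF = float("inf")
    # best[c] = (cost, picks) of the cheapest subset covering at least c units
    best = [(INF, ())] * (d + 1)
    best[0] = (0, ())
    for k, (f, D) in enumerate(zip(costs, demands)):
        for c in range(d, -1, -1):          # descending: each route used once
            cost, picks = best[c]
            if cost == INF:
                continue
            nc = min(d, c + D)              # coverage beyond d is clamped
            if cost + f < best[nc][0]:
                best[nc] = (cost + f, picks + (k,))
    return best[d]
```

On the row-1 data of the Appendix example (costs 0, 58, 23; demands 92, 64, 21; d_1 = 113) it returns (23, (0, 2)), matching Π_1 = 23 with routes (1, 1) and (1, 3) open.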

Since the row feasibility test described above can be applied for each row (supplier) i
given the partial solution J_s, the value RF = Σ_i RF_i becomes a lower bound on the sum of
fixed charges required for a feasible completion to J_s. If this value of RF is greater than or
equal to Z (the current best known feasible solution), we have fathomed J_s.

As pointed out previously, the knapsack relation which we employ for determining the
bound RF requires relatively little computational effort. Even so, the computational efficiency
of our procedure increases as the number of such knapsack problems necessary to solve PFCTP
decreases. The paragraph that follows indicates the procedures we employ in order to reduce
the computational cost of using our knapsack relation.

At the initialization of our procedure, when all routes are considered free, we calculate
RF by applying (10)-(12) for each row i as previously described, then store the knapsack solu-
tion for each such row. Thereafter, as we assign a route to be open or closed, we update the
bound RF by adjusting the knapsack solution and its objective value for the corresponding row
only. Also, upon assigning a route to be open (y_ij = 1), the knapsack relation need be applied
only if:

(1) AD_i < S_i and

(2) the route (i, j) is not one of the assigned open routes in the stored knapsack solution for its row i.

Similarly, upon assigning a route to be closed (y_ij = 0), the knapsack relation need be applied
only if:

(1) AD_i < S_i and

(2) the route (i, j) is not one of the assigned closed routes in the stored knapsack solution for its row i.



PURE FIXED CHARGE TRANSPORTATION PROBLEM 635 

Illb. Column Feasibility Test 

In determining column feasibility for column j, CF_j, we use procedures strictly analogous
to those described for determining the cost for row feasibility. We define AS_j as the total sup-
ply assigned to column j given the partial solution J_s, where

(14) AS_j = Σ_i S_i y_ij

and all free variables in column j are assumed closed. If AS_j ≥ D_j, then CF_j is simply the total
cost of the open routes in column j; i.e.,

(15) CF_j = Σ_i f_ij y_ij.

If AS_j < D_j, however, we must determine the minimum additional cost which must be
incurred in order to satisfy a necessary condition for column feasibility. To do so, we solve the
following 0-1 knapsack problem relative to the free variables in column j:

(16) Minimize Π_j = Σ_{i ∈ J_2(j)} f_ij v_ij

(17) subject to Σ_{i ∈ J_2(j)} S_i v_ij ≥ d_j

(18) v_ij = 0, 1 ∀ i ∈ J_2(j)

where d_j = D_j − AS_j and J_2(j) = the set of unassigned routes to demand j. Given that assigned
supply is less than the necessary demand for column j, the minimum cost necessary to obtain
column feasibility becomes

(19) CF_j = Σ_i f_ij y_ij + Π_j.

The rules for applying the knapsack relation (16)-(18) follow closely those defined for rows in
the preceding subsection. Also, since this column feasibility test can be applied for each
column (demander) j given the partial solution J_s, the value CF = Σ_j CF_j becomes a lower
bound on the sum of fixed charges required for a feasible completion to J_s.

The best available bound assignable to the partial solution J_s becomes max (RF, CF).

IIIc. Basis Constraint Test

Hirsch and Dantzig showed that, for any fixed charge problem, an optimal solution occurs
at an extreme point of the (continuous) constraint set. This implies that the x_ij values
corresponding to a partial solution of the y_ij's must be linearly independent and must not be
infeasible. Any partial solution which does not satisfy these conditions may be terminated.
Also, the maximum number of nonzero elements (i.e., routes (i, j) for which y_ij = 1) in a
basic solution is m + n − 1.
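The count condition is trivial to check during enumeration; a minimal sketch (the function name is ours):

```python
def violates_basis_limit(open_routes, m, n):
    """Hirsch-Dantzig test: an extreme-point (basic) solution of an m-supplier,
    n-demander transportation problem has at most m + n - 1 nonzero routes,
    so a partial solution opening more routes than that can be fathomed."""
    return len(open_routes) > m + n - 1
```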

IIId. Choice of Separation Variable

The separation variable y_i*j* is chosen from amongst the sets of variables u* and
v*. We define u* as the optimal set of open variables obtained by solving (10)-(12) for
each row i for which AD_i < S_i given J_s, and v* as the optimal set of open variables obtained by
solving (16)-(18) for each column j for which AS_j < D_j given J_s. If Π_i(j) represents the
objective value of (10)-(12) given the closure of route (i, j) in row i, then p_ij = Π_i(j) − Π_i is
the penalty associated with the closure of route (i, j). Similarly, Π_j(i) represents the objective
value of (16)-(18) given the closure of route (i, j) in column j, and q_ij = Π_j(i) − Π_j is the
penalty associated with the closure of route (i, j). A nonzero penalty need be obtained only
for those routes included in u* and v*.

Given the determination of the set of penalties p = {p_ij} associated with the closure of each
route in u*, the maximum increase in the value of row feasibility RF given the closure of any
route in u* becomes p_M = max_{(i,j) ∈ u*} p_ij. Similarly, the maximum increase in the value for
column feasibility CF given the closure of any route in v* becomes q_M = max_{(i,j) ∈ v*} q_ij. The
separation variable y_i*j* is therefore that currently unassigned variable whose closure would
yield the greatest bound associated with J_s:

(20) r_i*j* = max (RF + p_M, CF + q_M).

In the event that u* and v* are empty sets (i.e., AD_i ≥ S_i ∀i and AS_j ≥ D_j ∀j), then the
separation variable y_i*j* becomes the currently unassigned variable having minimum fixed cost.
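The selection rule of relation (20) can be sketched as follows (a hypothetical helper, not the authors' code; the route keys and penalty maps are illustrative):

```python
def choose_separation_variable(RF, CF, row_penalties, col_penalties):
    """Relation (20): among the routes of u* and v*, pick the unassigned route
    whose closure yields the largest bound.  row_penalties maps a route (i, j)
    in u* to its penalty p_ij; col_penalties maps a route in v* to q_ij.
    Returns (route, bound), or None when u* and v* are both empty."""
    candidates = [(RF + p, route) for route, p in row_penalties.items()]
    candidates += [(CF + q, route) for route, q in col_penalties.items()]
    if not candidates:
        return None           # fall back to the minimum-fixed-cost rule
    bound, route = max(candidates)
    return route, bound
```

With the Appendix values RF = 64, CF = 81, a row penalty of 57 (the route index used below is illustrative) and q_12 = 41, the rule selects route (1, 2) with bound 122.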

This completes the discussion of the iterative procedure used, and the set of tests 
employed in order to eliminate partial solutions. For a simple example which illustrates the 
application of these tests within the procedure, the reader is referred to the Appendix. 

IV. COMPUTATIONAL EXPERIENCE 

The algorithm as described here, PURFIX, has been programmed in FORTRAN IV and 
run on the CYBER 70/74 using the time-share mode. A series of 5 x 5 problems similar to 
those originally tested by Kennington [11] were run. These problems had uniformly generated 
supplies and demands over the range 1-999 with uniformly generated costs using various 
ranges. The cost parameters and test results are shown in Table 1 below. In addition, the Ken- 
nington code was obtained for use as a benchmark to determine the relative efficiency of our 
algorithm. These results are also shown in Table 1. All solution times are an average of five 
problems. Also shown for both procedures is the difference between the fastest and the slowest 
solution times for each set of problems (the range). 



TABLE 1

    Problem   Fixed Cost            PURFIX               Kennington
    Set       Range            Avg. Time   Range     Avg. Time    Range
    1         257-457             5.964    12.936      10.897     39.859
    2         614-814            12.884    21.604      32.285    122.542
    3         1328-1528           8.069    15.401      33.665     81.796
    4         1231-3231           2.784     2.577       4.626      5.507
    5         3463-5463           5.481     6.212       8.017     12.368
    6         34700-36700        13.239    26.422       7.645     17.143
    7         66400-76400         8.042    15.327      34.021     81.823
    8         2570000-4570000     4.427     9.393      11.532     42.555

All times are in CPU seconds and do not include problem generation.



PURE FIXED CHARGE TRANSPORTATION PROBLEM 



637 



As may be seen in Table 1, PURFIX is faster than the Kennington code in all cases
except one. This case happens to be where the fixed cost range is fairly small compared to the
magnitude of the fixed costs. Under these conditions PURFIX would be expected to have
difficulty distinguishing the optimal solution. In six of the remaining seven cases, PURFIX is
at least twice as fast. It can also be noted that for both procedures the range values are fairly
large. This implies that the effectiveness of either procedure for pure fixed charge transporta-
tion problems is highly dependent upon the particular problem being solved and can vary
greatly from problem to problem. As with the solution times, the range values are less for
PURFIX in seven out of the eight cases.

To test the effectiveness of the PURFIX procedure relative to problem size, we ran six
sets of three problems each. All sets were similar except for problem size. The fixed charges
were randomly generated with values between 0 and 10, and demands were generated with
values between 10 and 100 in increments of 10. The supplies were generated in a similar
manner in such a way that total supply equals total demand. For each problem set, there were
five supplies but differing numbers of demands. The results from the computational testing are
shown in Table 2, with average times (CPU seconds) and ranges shown for each problem set.

TABLE 2

    Problem Set   Size (m x n)   Average Solution Time    Range
    1             5 x 5                  .196              .248
    2             5 x 7                  .467             1.171
    3             5 x 9                 2.250             4.045
    4             5 x 10                 .377              .362
    5             5 x 13                2.451             2.963
    6             5 x 15                 ***               ***



In Table 2 we see that while the number of arcs has a definite effect on solution time, it is
not always the only determinant of difficulty of solution. This is evident from the fact that
problem set four with 50 arcs was solved in less time than that required for problem sets two
and three, each having fewer arcs. PURFIX was unable to solve any problems having 75 or
more arcs in less than an average of 50 seconds.

Another factor that could affect ease of solution is the shape of the problem. By this is
meant the relationship between the number of supplies and the number of demands. The problems
tested in Table 2, with the exception of Set 1, were all rectangular problems with more demands
than supplies. In Table 3 we have also tested "square" problems, i.e., those with equal numbers
of supplies and demands. All characteristics other than shape were the same as for Table 2.



TABLE 3

    Problem Set   Size (m x n)   Average Solution Time    Range
    A             6 x 6                  .597             1.230
    B             7 x 7                 2.451             3.489
    C             8 x 8                 1.873             1.895




If we compare the problem sets in Table 3 to the problem sets in Table 2 having approxi-
mately the same number of arcs, i.e., problem sets 2, 4, and 5, we can get some idea of the
effect of shape on ease of solution. However, the comparisons do not show any clear difference
in solution times that could be attributed to the shape of the problem.

In summary, these computational results imply that while the size of the problem has a
definite effect, it appears that ease of solution is highly dependent on some combination of
costs and supplies and demands. The exact effect is unclear but definitely deserves further
research.



REFERENCES

[1] Balinski, M.L., "Fixed Cost Transportation Problem," Naval Research Logistics Quarterly, Vol. 8, pp. 41-54 (1961).
[2] Barr, R.L., "The Fixed Charge Transportation Problem," presented at the Joint National Meeting of ORSA and TIMS in Puerto Rico (1975).
[3] Fisk, J., "An Initial Bounding Procedure for Use with 0-1 Single Knapsack Algorithms," Opsearch, Vol. 14, pp. 88-98 (1977).
[4] Frank, R., "On the Fixed Charge Hitchcock Transportation Problems," (dissertation), Johns Hopkins (1972).
[5] Geoffrion, A.M. and R.E. Marsten, "Integer Programming Algorithms: A Framework and State-of-the-Art Survey," in Perspectives on Optimization, A.M. Geoffrion, Ed., Addison-Wesley (1972).
[6] Gray, P., "Exact Solution of the Fixed-Charge Transportation Problem," Operations Research, Vol. 19, pp. 1529-1538 (1971).
[7] Greenberg, H. and R. Hegerich, "A Branch Search Algorithm for the Knapsack Problem," Management Science, Vol. 16, pp. 327-332 (1970).
[8] Hirsch, W.M. and G.B. Dantzig, "The Fixed Charge Problem," Naval Research Logistics Quarterly, Vol. 15, pp. 413-424 (1968).
[9] Kennington, J. and V. Unger, "The Group-Theoretic Structure in the Fixed-Charge Transportation Problem," Operations Research, Vol. 21, pp. 1142-1153 (1973).
[10] Kennington, J.L. and V.E. Unger, "A New Branch and Bound Algorithm for the Fixed-Charge Transportation Problem," Management Science, Vol. 22, pp. 1116-1126 (1976).
[11] Kennington, J.L., "The Fixed-Charge Transportation Problem: A Computational Study with a Branch-and-Bound Code," AIIE Transactions, Vol. 7, pp. 241-247 (1975).
[12] Khumawala, B.M. and U. Akinc, "An Efficient Branch and Bound Algorithm for the Capacitated Warehouse Location Problem," Management Science, Vol. 23, pp. 585-594 (1977).
[13] Kuhn, H.W. and W.J. Baumol, "An Approximative Algorithm for the Fixed-Charges Transportation Problem," Naval Research Logistics Quarterly, Vol. 9, pp. 1-16 (1962).
[14] McKeown, P.G., "A Vertex Ranking Procedure for the Linear Fixed Charge Problem," Operations Research, Vol. 23, No. 6, pp. 1183-1191 (1975).
[15] Murty, K.G., "Solving the Fixed Charges Problem by Ranking the Extreme Points," Operations Research, Vol. 16, pp. 268-279 (1968).
[16] Ross, G.T. and R.M. Soland, "A Branch and Bound Algorithm for the Generalized Assignment Problem," Mathematical Programming, Vol. 8, pp. 91-103 (1975).
[17] Steinberg, D., "The Fixed Charge Problem," Naval Research Logistics Quarterly, Vol. 17, No. 2, pp. 217-234 (1970).
[18] Zoltners, A.A., "A Direct Descent Binary Knapsack Algorithm," Working Paper #75-31, University of Massachusetts (1975).






APPENDIX 

For purposes of illustration, consider the following simple example problem: 



[3 x 3 example tableau: demands 92, 64, 21; row-1 supply 113 with fixed costs f_11 = 0, f_12 = 58, f_13 = 23; the remaining entries are garbled in the scan.]



Calculation of row feasibility for the first row, RF_1, requires solution of the following single
knapsack problem:

Minimize Π_1 = 0u_11 + 58u_12 + 23u_13

subject to 92u_11 + 64u_12 + 21u_13 ≥ 113

u_1j = 0, 1, ∀j.

The optimal solution to the above problem is u* = (u_11 = 1, u_13 = 1, u_12 = 0) and Π_1 = 23.
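The value Π_1 = 23 can be checked by brute force over all subsets of the three routes (a quick verification sketch, not part of the original procedure):

```python
from itertools import combinations

# Row-1 data of the example: fixed costs and demands for routes (1,1), (1,2), (1,3)
costs = {"u11": 0, "u12": 58, "u13": 23}
demand = {"u11": 92, "u12": 64, "u13": 21}
S1 = 113  # supply to be absorbed

# Enumerate every subset of routes covering S1 and keep the cheapest
best = min(
    (sum(costs[r] for r in sub), sub)
    for k in range(len(costs) + 1)
    for sub in combinations(costs, k)
    if sum(demand[r] for r in sub) >= S1
)
# best recovers Pi_1 = 23 via u11 = u13 = 1, as stated in the text
```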
Additional knapsack solutions can be obtained for rows 2 and 3 so that the following row feasi-
bility table can be constructed:



    i    Π_i
    1     23
    2     29
    3     12

Each cell in the full table can be interpreted as follows: a "1" in the upper diagonal indicates
that the corresponding transportation route is assigned to be "open" in the optimal knapsack
solution, while the value in the lower diagonal indicates the penalty associated with closing that
route. Empty cells indicate that the corresponding transportation route is "closed" in the knap-
sack solution for its row. Row feasibility is RF = Σ_i Π_i = 23 + 29 + 12 = 64. A table similar
to that for rows can be constructed for columns as follows:






    j    Π_j
    1      0
    2     58
    3     23

Column feasibility is CF = Σ_j Π_j = 0 + 58 + 23 = 81.

Since an infinite penalty is associated with the closure of route (1, 1), y_11 is set to 1
and assigned to J_s. Since column feasibility is now obtained for column 1, the next variable is
chosen from rows 1, 2, and 3 and columns 2 and 3. Since p_M = 57 and q_M = q_12 = 41,
the next variable assigned to J_s is that currently unassigned variable y_i*j* for which r_i*j* = max
(64 + 57, 81 + 41) = 122, i.e., y_i*j* = y_12. Assigning y_12 to one and adding it to J_s simultane-
ously satisfies row feasibility in row 1 and column feasibility in column 2. Row feasibility is
increased from 64 to 99, since route (1, 2) was not in the optimal knapsack solution for calcu-
lating RF_1.

The procedure continues in a similar manner. The solution tree for our example problem
is found in Figure 1 below:

[Figure 1. Branching tree for example problem. Nodes show the assigned route in parentheses
and the [RF, CF] bounds in brackets: (1, 1) [64, 81]; (1, 2) [99, 81]; (3, 1) [99, 93];
(1, 3) [122, 93]; (2, 2) [122, 122].]




Note that the route assigned at each level of the tree is in parentheses, while the values for row
and column feasibility are in brackets. The unit flows associated with the solution obtained in
Figure 1 are as follows:

    x_11 = 73   x_12 = 19   x_13 = 21
    x_22 = 45
    x_31 = 19

(row totals 113, 45, 19; column totals 92, 64, 21). Optimal solution value is 122.



A HEURISTIC ROUTINE FOR SOLVING LARGE LOADING PROBLEMS

John C. Fisk 

State University of New York at Albany 
Albany, New York 

Ming S. Hung 

Cleveland State University 
Cleveland, Ohio 

ABSTRACT 

The loading problem involves the optimal allocation of n objects, each hav-
ing a specified weight and value, to m boxes, each of specified capacity. While
special cases of these problems can be solved with relative ease, the general
problem having variable item weights and box sizes can become very difficult
to solve. This paper presents a heuristic procedure for solving large loading
problems of the more general type. The procedure uses a surrogate procedure
for reducing the original problem to a simpler knapsack problem, the solution
of which is then employed in searching for feasible solutions to the original
problem. The procedure is easy to apply, and is capable of identifying optimal
solutions if they are found.



I. INTRODUCTION 

The loading problem involves the optimal allocation of n objects i = 1, 2, ..., n, each
having a given value c_i and weight w_i, to m boxes, j = 1, 2, ..., m, each having capacity b_j.
Several types of loading problems exist, as indicated in Eilon and Christofides [4]. Two of
these are:

PROBLEM 1 (P1): Given that Σ_j b_j ≥ Σ_i w_i, determine the minimum number of
boxes required to accommodate all items.

PROBLEM 2 (P2): Given that Σ_j b_j < Σ_i w_i (or Σ_j b_j ≥ Σ_i w_i but not all objects
can be accommodated), determine the maximum value of
objects accommodated in the boxes.



The integer program for problem (P1) can be written as follows:

(1) (P1) minimize Σ_j d_j y_j

(2) subject to Σ_j x_ij = 1, ∀i



643 



644 J C. FISK & M.S. HUNG 

(3) Σ_i w_i x_ij ≤ b_j y_j, ∀j

(4) y_j = 0, 1 ∀j; x_ij = 0, 1 ∀i, j

where

y_j = 1 if box j contains one or more objects, 0 otherwise;
x_ij = 1 if object i is placed in box j, 0 otherwise;

and d_j is the cost of box j. For d_j = 1, ∀j, the problem reduces to determining the minimum
number of boxes required to hold all objects. If d_j = b_j, ∀j, the above problem becomes that
of determining the minimum capacity set of boxes required to hold all objects.

For (P2) the integer programming problem is

(5) (P2) maximize Σ_i Σ_j c_i x_ij

(6) subject to Σ_j x_ij ≤ 1, ∀i

(7) Σ_i w_i x_ij ≤ b_j, ∀j

(8) x_ij = 0, 1 ∀i, j.

Eilon and Christofides present a heuristic procedure for solving a special case of problem
(P1) in which d_j = 1, ∀j. In addition, they introduce an enumerative algorithm based on the
work of Balas [2] which yields satisfactory results. A more efficient algorithmic procedure for
this problem which again takes advantage of uniform box costs is presented by Hung and
Brown [10].

For solving problem (P2), Ingargiola and Korsh [12] introduce an ordering relation which
allows a reduction in the amount of searching required within an enumerative scheme. Hung
and Fisk [11] present procedures which rely on Lagrangian and surrogate relaxations to yield
good bounds in a branch and bound scheme. Similar procedures have been developed by Mar-
tello and Toth [13]. Each of these procedures appears to yield satisfactory results as long as the
number of items is small (≤ 100) and the number of boxes does not exceed three.

This paper presents a simple and effective heuristic procedure for solving loading prob- 
lems (PI) and (P2) of much larger scale than those that have been attempted before. The pro- 
cedure is similar to that of Glover [9] in that it uses surrogate constraints to obtain some feasi- 
ble solutions, but it has two distinctive features usually not found in heuristic procedures. One 
is that our procedure uses the surrogate constraints to reduce the problems (PI) and (P2) to 
simpler problems which in fact are the well known knapsack problems. Then we use the solu- 
tions to the knapsack problems to reduce the set of variables to be considered later on. 
Another feature is that our procedure will identify optimal solutions if they are found. More 
specifically, if the reduced set of variables produces a feasible solution then we know that the 
solution is optimal. 



SOLVING LARGE LOADING PROBLEMS 645 

II. SURROGATE RELAXATIONS FOR (PI) AND (P2) 

The concept and applicability of surrogate relaxation were introduced by Glover [7], [8],
while useful refinements were suggested by Balas [3] and Geoffrion [6]. Surrogate relaxation in
its simplest form is to replace a set of constraints by a single constraint (the surrogate con-
straint). For example, for problem (P1) a nonnegative vector of real numbers a = (a_j) can be
used to aggregate the m constraints in (3) into a single one,

(9) Σ_j Σ_i a_j w_i x_ij ≤ Σ_j a_j b_j y_j.

Similarly for (P2), a nonnegative real vector π = (π_j) can be used to combine the m con-
straints in (7) into the following,

(10) Σ_i Σ_j π_j w_i x_ij ≤ Σ_j π_j b_j.

Let (P1_a) and (P2_π) respectively denote the surrogate relaxations of (P1) and (P2). The
relaxations (P1_a) and (P2_π) can provide good bounds on the optimal solutions of their
respective original problems given a suitable choice of multipliers. Balas [3], Geoffrion [6] and
Hung and Fisk [11] have shown that one suitable choice is to set them equal to the optimal
dual multipliers of the aggregated constraints in the linear programming relaxation. For example,
let π = (π_j) represent the set of optimal dual multipliers of constraints (7) in the linear pro-
gram of (P2). The linear program is obtained by replacing the 0-1 constraints (8) with unit
intervals 0 ≤ x_ij ≤ 1 for all i, j. Hung and Fisk [11] showed that

π_j = c_t/w_t for all j,

where t is the smallest object index such that Σ_{i ≤ t} w_i > Σ_j b_j. Items are assumed to be
ordered such that

c_1/w_1 ≥ c_2/w_2 ≥ ... ≥ c_n/w_n.

For (P1), if a = (a_j) represents the set of optimal dual multipliers of constraints (3) in the
linear program of (P1), then

a_j = d_u/b_u for all j,

where u is the smallest box index such that Σ_{j ≤ u} b_j ≥ Σ_i w_i. The boxes are assumed
ordered such that

d_1/b_1 ≤ d_2/b_2 ≤ ... ≤ d_m/b_m.



Since a_j is a constant for all j, constraint (9) can be simplified and the surrogate problem
(P1_a) becomes a single knapsack problem as follows:

(P1_a) minimize Σ_j d_j y_j

subject to Σ_i w_i x_i = Σ_i w_i ≤ Σ_j b_j y_j

y_j = 0, 1 ∀j.

Similarly, (P2_π) has the following simple form:

(P2_π) maximize Σ_i c_i x_i

subject to Σ_i w_i x_i ≤ Σ_j b_j

x_i = 0, 1 ∀i,

where x_i = Σ_j x_ij.

The single knapsack problems defined within (P1_a) and (P2_π) can be easily solved (see,
e.g., Ahrens and Finke [1], Fisk [5]). It is clear that an optimal solution to the relaxed prob-
lem, either (P1_a) or (P2_π), may violate the original constraints ((3) in (P1) and (7) in
(P2)) because the assignment of items to boxes is ignored in the relaxed problems. How-
ever, if a feasible assignment can be found among the items and boxes identified in the optimal
solutions to the relaxed problems, then optimal solutions to the original problems are found.
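For (P2_π) the aggregation collapses the m capacity constraints into one knapsack of capacity Σ_j b_j. A small dynamic-programming sketch (our own illustration, not the authors' Fortran routine; integer weights assumed):

```python
def surrogate_p2(values, weights, capacities):
    """(P2_pi) in knapsack form: with equal multipliers the m box constraints
    aggregate into a single capacity sum(b_j).  Plain 0-1 knapsack DP.
    Returns (best total value, tuple of chosen item indices)."""
    B = sum(capacities)
    best = [(0, ())] * (B + 1)
    for k, (c, w) in enumerate(zip(values, weights)):
        for cap in range(B, w - 1, -1):     # descending: 0-1, not unbounded
            v, picks = best[cap - w]
            if v + c > best[cap][0]:
                best[cap] = (v + c, picks + (k,))
    return max(best)
```

An optimal solution of this relaxation fixes the candidate item set S; it may still violate the individual box capacities, which is exactly what the exchange routine of Section III repairs.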
Let S represent a set of items and T a set of boxes. For (P1) let

S = {all items}
T = {all boxes for which y_j = 1 in the optimal solution to (P1_a)}

while for (P2) let

S = {all items for which x_i = 1 in the optimal solution to (P2_π)}
T = {all boxes}.

When solving (P1), if an assignment of S to T is found which satisfies constraint sets (2)-(4),
then this assignment represents a feasible and optimal solution to (P1). Similarly, when solving
(P2), if an assignment of S to T is found which satisfies constraint sets (6)-(8), then this
assignment represents a feasible and optimal solution to (P2). The following section describes
a procedure which searches for feasible and (therefore) optimal assignments of S to T.



III. AN EXCHANGE ROUTINE FOR (P1) AND (P2)

The exchange routine to be described is similar in many respects to the heuristic algo-
rithm presented by Eilon and Christofides. For a particular problem, the procedure uses only
the sets S and T as previously defined, and searches for an assignment of items which is feasi-
ble for the original problem. If such a feasible assignment is found, then it also represents an
optimal assignment for the original problem.






The exchange procedure we use is the following:

STEP 1: Solve (P1_a) [or (P2_π) if the problem is to solve (P2)] and identify the set of items S
and the set of boxes T. Place the items in S in a list in descending order according
to weight. The boxes in T can be placed in a new list in any order.

STEP 2: Take the first item in the list and attempt to place it in a (randomly selected) box. If
no items are in the list, the optimal solution has been found; stop. If sufficient space
remains to accommodate the item in the box chosen, go to STEP 4. If insufficient
space remains, however, go to STEP 3.

STEP 3: Attempt to place the item in one of the remaining boxes. If such a box exists, go to
STEP 4; otherwise, go to STEP 5.

STEP 4: Record the assignment of the item and its weight to the box and remove the item
from the list. Return to STEP 2.

STEP 5: List the set of boxes in descending order of remaining capacity. Let m_B be the box
number for which minimum excess capacity remains. Considering boxes one and two
in the list only, attempt a one-for-one exchange of items between boxes such that the
available space in one of the boxes is fully utilized. If such an exchange can be made
amongst all possible one-for-one exchanges, record it and return to STEP 2. If no
such exchange is possible, attempt two-for-one, then one-for-two exchanges, box one
to box two. If such an exchange is possible, record it and go to STEP 2; otherwise,
go to STEP 6.

STEP 6: Repeat STEP 5 for boxes one and three, one and four, etc. to one and m_B, then two
and three, two and four, etc. to two and m_B, and so on. If a satisfactory exchange is
still not evident, terminate; the heuristic procedure does not yield a feasible solution
to the original problem.



It is important to note that the above exchange routine will always produce a feasible solu-
tion to (P2), and if every item in S is successfully placed in boxes, an optimal solution as well.
Of course, when some items in S cannot be assigned to boxes, the solution may still be optimal,
but unproven. Furthermore, for (P1) the exchange routine can be modified to always yield a
feasible solution. The modification is that in STEP 6, when it appears that not all items can be
assigned to the boxes in T, T may be expanded to include a box not originally belonging to T.
Again, in such a case an optimal solution to the original problem (P1) may still be found, but we
cannot prove its optimality.
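A compressed sketch of the routine (first-fit over the heaviest-first list plus a one-for-one repair swap; the full STEP 5 and STEP 6 also attempt two-for-one and one-for-two exchanges, and all names here are our own):

```python
def pack(items, boxes):
    """First-fit placement with a one-for-one repair exchange.
    items: weights of the set S; boxes: capacities of the set T.
    Returns a list mapping each box to its item indices, or None if stuck."""
    order = sorted(range(len(items)), key=lambda i: -items[i])  # heaviest first
    loads, assign = [0] * len(boxes), [[] for _ in boxes]
    for i in order:
        w = items[i]
        j = next((j for j in range(len(boxes)) if loads[j] + w <= boxes[j]), None)
        if j is None:
            j = repair(i, items, boxes, loads, assign)   # STEP 5 analogue
            if j is None:
                return None      # exchanges exhausted: no feasibility proof
        assign[j].append(i)
        loads[j] += w
    return assign

def repair(i, items, boxes, loads, assign):
    """Swap one item between two boxes when the swap frees room for item i."""
    w = items[i]
    for j in range(len(boxes)):
        for k in range(len(boxes)):
            if j == k:
                continue
            for a in assign[j]:
                for b in assign[k]:
                    lj = loads[j] - items[a] + items[b]
                    lk = loads[k] - items[b] + items[a]
                    if lj + w <= boxes[j] and lk <= boxes[k]:
                        assign[j].remove(a); assign[k].remove(b)
                        assign[j].append(b); assign[k].append(a)
                        loads[j], loads[k] = lj, lk
                        return j
    return None
```

For example, with weights 9, 6, 5 and capacities 11, 9, first-fit stalls on the third item until the swap moves the 9 out of the larger box, after which all items are placed.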

IV. COMPUTATIONAL EXPERIENCE AND CONCLUSIONS 

The procedure as described has been programmed in Fortran V and run on a
UNIVAC 1110. While (P1) and (P2) represent different problem situations, the solution pro-
cedure we use for each is essentially the same, and the results obtained for one of the problems
using our procedure would be expected to reflect closely the procedure's effectiveness in solv-
ing the other. For this reason, computational results for solving (P2) only will be presented.
The single knapsack routine we use for solving the surrogate relaxation is adapted from Pro-
gram β of Ahrens and Finke [1].

The brackets [ ] denote the greatest integer less than or equal to the enclosed quantity.






A series of 100 problems consisting of up to 1000 objects and up to six boxes was
obtained by generating values and weights independently from a uniform distribution on the
interval [10, 100]. Box capacities were then generated in a similar manner except that the interval
b_l ≤ b_j ≤ b_u was used, where b_l = [.4(Σ_i w_i/m)] and b_u = [.6(Σ_i w_i/m)]. The final box capacity
generated, b_m, was chosen such that the occupancy ratio Σ_j b_j / Σ_i w_i = .5. If b_m < min w_i or
max b_j < max w_i, the set of generated box capacities was discarded and a new set generated.
The occupancy ratio of .5 was used for all problems attempted.
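The generation scheme above can be sketched as follows (the rounding details and the use of m in the capacity interval are our assumptions, and the rejection step for degenerate capacity sets is omitted):

```python
import random

def generate_p2(n, m, seed=1):
    """Sketch of the test-problem generator described above.  Values and
    weights are uniform on [10, 100]; the first m - 1 capacities fall between
    .4 and .6 of sum(w)/m; the last capacity is fixed so that the occupancy
    ratio sum(b)/sum(w) equals .5 (up to integer truncation)."""
    rng = random.Random(seed)
    values = [rng.randint(10, 100) for _ in range(n)]
    weights = [rng.randint(10, 100) for _ in range(n)]
    avg = sum(weights) / m
    caps = [int(rng.uniform(.4 * avg, .6 * avg)) for _ in range(m - 1)]
    caps.append(int(.5 * sum(weights)) - sum(caps))  # enforce occupancy .5
    return values, weights, caps
```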

Table 1 indicates computation times for our algorithm when solving (P2). For each
item/box combination a total of ten problems was attempted, and the total number of optimal
solutions to (P2) using the exchange routine was recorded along with solution times. For all
problems except one, the optimal solution was obtained upon the first application of the
exchange routine. For the remaining problem, the exchange routine was rerun with objects
assigned to boxes according to a (new) random order, at which time an optimal solution was
found. As an indication of the relative efficiency of the procedure, the algorithm of Hung and
Fisk [11] using a Lagrangian relaxation solved, after 250 seconds CPU time (UNIVAC 1108),
only four problems in a ten-problem set containing 200 items and four boxes. The procedure
presented here was able to solve all ten of these problems in 4.6 seconds.

TABLE 1. Computational Results for (P2)

    Number            Four Boxes                      Six Boxes
    of        Solution       Proven Optimal   Solution       Proven Optimal
    Items     Time*          Solutions        Time           Solutions**
    50        .03/.02/.05         10          .03/.03/.06          9
    100       .09/.05/.14         10          .09/.05/.14         10
    200       .35/.11/.46         10          .35/.13/.48         10
    500       2.0/.6/2.6          10          2.0/.6/2.6          10
    1000      7.9/2.1/10.1        10          7.9/2.1/10.1        10

*Average CPU seconds per problem for surrogate relaxation/exchange routine/total.
**For the one problem not terminating optimally, a feasible solution was found
whose solution value was within 1.5% of the surrogate bound.



As can be seen from Table 1, the computational efficiency of the heuristic is less sensitive
to the number of objects in the knapsack algorithm than are other algorithms [4], [10], [11],
[12], and [13]. Furthermore, the exchange routine is much less sensitive to the number of
boxes than are the other algorithms. As a further example of this, a set of ten problems gen-
erated as in Table 1 but having 1000 objects and 10 boxes was solved in about the same
total CPU time as required by the problem set having 1000 items and 6 boxes. All problems
were again proven optimal. In effect, the overall efficiency of the heuristic is equivalent to that
of a knapsack algorithm.

As a final test of our procedure, we chose to solve a series of problem sets containing 200
items and four boxes, generated as in Table 1 but having narrower ranges of item weights and
box sizes. The results are summarized in Table 2. A total of ten problems was attempted for
each set, and the number of problems for each range of item weights and box sizes which
terminated optimally is specified.

Table 2 indicates that the exchange procedure remains effective as long as some variation
exists amongst item weights. The amount of variation in box sizes seems to have little effect
upon the ability of the procedure to terminate optimally. That all ten problems terminate
optimally for the problem set in which no variability is allowed within either item weights or
box sizes is apparently fortuitous. For those problems in Table 2 which did not terminate
optimally, feasible solutions were obtained having solution values which were in every case
within 2.1% of the surrogate bound.

TABLE 2. Effect of Variation in Box and Item Weights
Upon Ability to Terminate Optimally

    Range of                     Range of Box Sizes
    Item Weights   .4b̄ ≤ b_j ≤ .6b̄*   .45b̄ ≤ b_j ≤ .55b̄   b_j = .5b̄
    10-100               10**                10                 10
    25-85                10                  10                 10
    40-70                10                  10                 10
    55                   --                  --                 10

*b̄ = Σ_i w_i/4
**Number of problems terminating optimally

As seen from Tables 1 and 2, the exchange routine described here appears to be quite
effective in obtaining provably optimal solutions to loading problems of the type (P1) or (P2)
in which the number of items and boxes is large and at least some variation exists in item
weights. For smaller problems, and for problems in which little or no variation in item weights
exists, available optimizing procedures may be more appropriate.

REFERENCES 



[1] Ahrens, J.H. and G. Finke, "Merging and Sorting Applied to the 0-1 Knapsack Problem," Operations Research, Vol. 23, pp. 1099-1109 (1975).
[2] Balas, E., "An Additive Algorithm for Solving Linear Programs with Zero-One Variables," Operations Research, Vol. 13, pp. 517-546 (1965).
[3] Balas, E., "Discrete Programming by the Filter Method," Operations Research, Vol. 15, pp. 915-957 (1967).
[4] Eilon, S. and N. Christofides, "The Loading Problem," Management Science, Vol. 17, pp. 259-268 (1971).
[5] Fisk, J., "An Initial Bounding Procedure for Use with 0-1 Single Knapsack Algorithms," Opsearch, Vol. 14, pp. 88-98 (1977).
[6] Geoffrion, A., "An Improved Implicit Enumeration Approach for Integer Programming," Operations Research, Vol. 17, pp. 437-454 (1969).
[7] Glover, F., "Surrogate Constraints," Operations Research, Vol. 16, pp. 741-749 (1968).
[8] Glover, F., "Surrogate Constraint Duality in Mathematical Programming," Operations Research, Vol. 23, pp. 434-451 (1975).
[9] Glover, F., "Heuristics for Integer Programming Using Surrogate Constraints," Decision Sciences, Vol. 8, No. 1, pp. 156-166 (1977).

650 J.C. FISK & M.S. HUNG

[10] Hung, M.S. and J.R. Brown, "An Algorithm for a Class of Loading Problems," Naval Research Logistics Quarterly, Vol. 25, pp. 289-297 (1978).
[11] Hung, M.S. and J.C. Fisk, "An Algorithm for 0-1 Multiple Knapsack Problems," Naval Research Logistics Quarterly, Vol. 25, pp. 571-579 (1978).
[12] Ingargiola, G. and J. Korsh, "An Algorithm for the Solution of 0-1 Loading Problems," Operations Research, Vol. 23, pp. 1110-1119 (1975).
[13] Martello, S. and P. Toth, "Solution of Zero-One Multiple Knapsack Problems," presented at the ORSA/TIMS National Meeting, Atlanta (1977).



SEARCH FOR AN INTELLIGENT EVADER 653 

are applicable only when certain relationships obtain between the detection probabilities for the 
various regions. 

In Section 3, a continuous-time version of the problem is described for which it transpires
that P* = P0. This continuous-time version is asymptotically equivalent to the discrete-time
version as the detection probabilities tend to zero.

Section 4 summarizes our investigation of the relationship between P* and P0 for different
values of N and of the detection probabilities. P0 is a particularly good approximation to P*
when the q_i (i = 1, 2, ..., N) are either all sufficiently large, or all sufficiently small, or not
too dissimilar. Furthermore, if the range of the q_i values is held more or less constant, the
accuracy of the approximation does not vary greatly with N. The case of N = 2 may as a result
be used as a point of reference, and a method of doing this is described.

Finally, in Section 5 we assess the N-region problem as viewed by the other player, the
searcher. Although it transpires that the searcher's optimal strategy is likely to be difficult to
determine exactly, this section shows that satisfactory approximations to it are usually
determinable without too much difficulty.

2. THE DETERMINATION OF P* AND V(P*)

This section describes three distinct approaches to the evaluation of P* and V(P*). The
first is quite general and it may be used for any problem of N regions, where N is limited only
by the capacity of one's computational facilities. There are no restrictions on the values of the
escape probabilities. The P* and V(P*) which this approach yields can be computed to any
desired degree of accuracy.

The second method applies to problems which are multiples of smaller problems whose
characteristics are already known. For example, the problem

(r1, r2, r3, r4, r5, r6)

is related to the smaller problem (r1, r2) if r3 and r5 are equal to r1, and r4 and r6 are
equal to r2: that is, the former problem consists simply of three blocks of the latter. So if P*
and V(P*) have been established for the smaller problem, this approach indicates how these
same characteristics may be obtained for the larger problem.

The third approach is applicable to problems of any N where the escape probabilities
assume only two values and where one escape probability is an integer power of the other.
These conditions lead to particularly simple expressions for P* and V(P*), and these
expressions yield useful bounds for V(P*) in situations where such conditions are not satisfied.

(i) A general method 

We begin by outlining a procedure for finding the expected payoff V(P) at any vector P,
by assuming that the searcher always plays optimally; that is, he consistently searches that
region with the greatest current p_i q_i. To simplify the exposition, assume that if i ≤ j, then
p_i q_i ≥ p_j q_j, both for the original vector P and for the vectors into which it is transformed by
(1).

Suppose then, without loss of generality, that P is such that

p1 q1 ≥ p2 q2 ≥ ... ≥ p_N q_N.



654 J.C. GITTINS & D.M. ROBERTS



Let b_ijk be the number of searches in region i before the k-th search of region j. From (2),
assuming the searcher's policy is as specified above, we can write

p_i r_i^(b_ijk - 1) q_i ≥ p_j r_j^(k-1) q_j,

and

p_i r_i^(b_ijk) q_i < p_j r_j^(k-1) q_j.

Suppose x is such that

p_i r_i^x q_i = p_j r_j^(k-1) q_j.

Hence

x = [log(p_j q_j / p_i q_i) + (k - 1) log r_j] / log r_i,

and

b_ijk - 1 ≤ x < b_ijk.

We can therefore write

(3)  b_ijk - 1 = Int{[log(p_j q_j / p_i q_i) + (k - 1) log r_j] / log r_i}
              = Int{log(p_j q_j / p_i q_i) / log r_i + (k - 1) n_i/n_j},

where Int denotes the integer part, and we take r_i and r_j to be related by r_i^(n_i) = r_j^(n_j), where n_i and
n_j are any numbers, not necessarily integers.
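Equation (3) can be checked directly against the two inequalities that define b_ijk. A minimal sketch (the probabilities and escape rates below are illustrative, chosen so that p1 q1 ≥ p2 q2 as assumed above; the function name is ours, not the paper's):

```python
import math

def b_ijk(p, q, r, i, j, k):
    """b_ijk of eq. (3): searches of region i before the k-th search of region j."""
    x = (math.log(p[j] * q[j] / (p[i] * q[i]))
         + (k - 1) * math.log(r[j])) / math.log(r[i])
    return int(x) + 1  # b_ijk - 1 = Int(x)

p, r = [0.75, 0.25], [0.8, 0.5]
q = [1 - ri for ri in r]
for k in range(1, 6):
    b = b_ijk(p, q, r, 0, 1, k)
    # exactly b searches of region 0 precede the k-th search of region 1:
    assert p[0] * r[0] ** (b - 1) * q[0] >= p[1] * r[1] ** (k - 1) * q[1]
    assert p[0] * r[0] ** b * q[0] < p[1] * r[1] ** (k - 1) * q[1]
```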

Next we define

V_ji = E(no. of searches in region j | evader in region i),
V_i = E(total no. of searches | evader in region i) = Σ_{j=1}^N V_ji.

From the definition of q_i, V_ii = 1/q_i. Otherwise, we can use the definition of b_ijk to write
expressions for V_ij in the following two situations:

(a) i < j

V_ij = b_ij1 + r_j (b_ij2 - b_ij1) + r_j^2 (b_ij3 - b_ij2) + r_j^3 (b_ij4 - b_ij3) + ...
     = q_j (b_ij1 + r_j b_ij2 + r_j^2 b_ij3 + ...).

To a first approximation, using (3),

b_ijk = b_ij1 + (k - 1) n_i/n_j.

So therefore

V_ij = b_ij1 + (n_i/n_j) q_j Σ_{k=2}^∞ (k - 1) r_j^(k-1) = b_ij1 + (n_i/n_j)(r_j/q_j).






THE SEARCH FOR AN INTELLIGENT EVADER 
CONCEALED IN ONE OF AN ARBITRARY NUMBER OF REGIONS 

J.C. Gittins 

University Mathematical Institute 
Oxford, England 

D.M. Roberts 

Ministry of Defence 
London, England 

ABSTRACT

This paper considers the search for an evader concealed in one of an
arbitrary number of regions, each of which is characterized by its detection
probability. We shall be concerned here with the double-sided problem in which
the evader chooses this probability secretly, although he may not subsequently
move; his aim is to maximize the expected time to detection, while the
searcher attempts to minimize it.

The situation where two regions are involved has been studied previously
and reported on recently. This paper represents a continuation of this analysis.
It is normally true that as the number of regions increases, optimal strategies
for both searcher and evader are progressively more difficult to determine
precisely. However it will be shown that, generally, satisfactory approximations to
each are almost as easily derived as in the two-region problem, and that the
accuracy of such approximations is essentially independent of the number of
regions. This means that, so far as the evader is concerned, characteristics of the
two-region problem may be used to assess the accuracy of such approximate
strategies for problems of more than two regions.



1. INTRODUCTION

In a recent paper — Roberts and Gittins [5], hereinafter referred to as R & G — an
analysis was given of a search problem involving two regions. The analysis is extended in this
paper to problems of similar type but with an arbitrary number of regions. Such problems may
be described as follows:

Suppose a stationary object is hidden in one of N distinct regions. The probability of its
being concealed in region i (i = 1, 2, ..., N) will be denoted by p_i, and the location probability
vector by

P = (p1, p2, ..., p_N).

Each region is characterized by its detection probability q_i, which is the probability that a search
of region i will discover the object if it is there; to avoid unnecessary complications, suppose

0 < q_i < 1.
651 




Often it will be advantageous to specify a region in terms of its escape probability r_i, where
r_i = 1 - q_i. We assume that the time taken to search any region is constant, and take this
constant to be the unit of time.

From Bayes' theorem it follows that an unsuccessful search of region j changes the location
probability vector as shown below:

(1)  p_j → p_j r_j / (1 - p_j q_j);   p_i → p_i / (1 - p_j q_j) for all i ≠ j.

It has been shown by, among others, Black [1] that the strategy which at any time searches the
region with the greatest current value of p_i q_i minimizes the expected time until the object is
found. For a given initial P this minimum is denoted V(P). Usually such a strategy is
deterministically defined, and as such can be considered as pure. However there are occasions when
p_i q_i is maximized for more than one value of i. Clearly in these circumstances the searcher can
choose between pure strategies, each of which will lead to the minimum expected time to
detection. We shall adhere to the terminology used in Norris [4] by referring to these pure
strategies as 'good' strategies.
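Under this "greatest p_i q_i" rule, V(P) can also be computed directly by truncating the series E(T) = Σ_t t · P(detection occurs at the t-th search). The sketch below is ours, not the paper's method (Section 2 develops successive approximations instead); for the two-region problem r1 = 0.8, r2 = 0.512 with P = P0 it reproduces the value V(P0) = 6.51067 quoted in Section 4:

```python
def expected_time(p, r, tol=1e-10):
    """V(P): expected number of searches when the searcher always
    searches the region with the greatest current p_i * q_i."""
    q = [1 - ri for ri in r]
    m = list(p)               # m[i] = p_i(0) * r_i^(searches of region i so far)
    t, total, remaining = 0, 0.0, 1.0
    while remaining > tol:
        t += 1
        i = max(range(len(p)), key=lambda j: m[j] * q[j])
        detect = m[i] * q[i]  # unconditional prob. that search t finds the object
        total += t * detect
        remaining -= detect
        m[i] *= r[i]
    return total

q1, q2 = 1 - 0.8, 1 - 0.512
p0 = [q2 / (q1 + q2), q1 / (q1 + q2)]   # P0: p_i * q_i constant
print(round(expected_time(p0, [0.8, 0.512]), 5))   # close to 6.51067
```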

We can extend equation (1) to determine that the transformation due to a sequence of
searches which involves a total of k_j searches of region j, for each j, will be as follows:

(2)  p_i → p_i r_i^(k_i) / Σ_{j=1}^N p_j r_j^(k_j).

Significantly, the order of the sequence has no effect on the final transformation.
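This order-independence is easy to confirm numerically by applying the single-search update (1) in every ordering of a fixed multiset of searches (the numbers below are illustrative):

```python
from itertools import permutations

def search(p, r, j):
    """Bayes update (1) after an unsuccessful search of region j."""
    denom = 1 - p[j] * (1 - r[j])
    return [(pi * r[j] if i == j else pi) / denom for i, pi in enumerate(p)]

p0, r = [0.5, 0.3, 0.2], [0.8, 0.6, 0.4]
seq = [0, 0, 1, 2, 1]        # two searches of region 0, two of region 1, one of region 2
results = set()
for order in set(permutations(seq)):
    p = p0
    for j in order:
        p = search(p, r, j)
    results.add(tuple(round(x, 9) for x in p))
assert len(results) == 1     # every order gives the same posterior

# and the posterior agrees with (2): p_i r_i^(k_i) / sum_j p_j r_j^(k_j)
unnorm = [p0[i] * r[i] ** seq.count(i) for i in range(3)]
s = sum(unnorm)
assert all(abs(a - b / s) < 1e-9 for a, b in zip(p, unnorm))
```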

In R & G this single-sided problem was considered in the form of a double-sided search
problem in which the initial value of P is chosen secretly by the object (or evader). In this
form the problem is a zero-sum game between the searcher and the evader, the payoff to the
evader being the time which the searcher takes to find him. This game has been considered by
Bram [2], who showed that it does possess a value, and that therefore the appropriate strategies
for the searcher and the evader are the minimax and maximin strategies respectively.

It is easy to show that if the evader is allowed to move quite freely between searches then
his maximum strategy is given at each stage by the location probability vector P0, defined such
that p_i q_i is the same for all i. In R & G it was shown that when N = 2, P0 is a remarkably
good approximation to the evader's maximum strategy P* for the more complicated, and often
more realistic, problem in which the evader, once hidden, remains stationary. This observation
also led to nearly optimal search strategies. The significance of these approximations is that
they are much more easily calculated than the exact solutions. In this paper we show that for
arbitrary values of N the approximation P* = P0 remains extremely good under most
circumstances, and this is our most important conclusion.

Detailed methods of calculating P* to any desired accuracy are given in Section 2. One of
these can be used for any set of detection probabilities. The others are abridged methods which





Similarly, to a second approximation,

b_ijk = b_ij2 + (k - 2) n_i/n_j,

so that

V_ij = q_j b_ij1 + r_j [b_ij2 + (n_i/n_j)(r_j/q_j)].

So for the ℓ-th approximation

V_ij = q_j b_ij1 + q_j r_j b_ij2 + q_j r_j^2 b_ij3 + ... + r_j^(ℓ-1) [b_ijℓ + (n_i/n_j)(r_j/q_j)].

(b) i > j

V_ij = (r_j^(b_ji1) - r_j^(b_ji2)) × 1 + (r_j^(b_ji2) - r_j^(b_ji3)) × 2 + (r_j^(b_ji3) - r_j^(b_ji4)) × 3 + ...
     = r_j^(b_ji1) + r_j^(b_ji2) + r_j^(b_ji3) + ...

To a first approximation, again using (3),

V_ij = r_j^(b_ji1) (1 + r_i + r_i^2 + ...) = r_j^(b_ji1)/q_i.

Similarly, to a second approximation,

V_ij = r_j^(b_ji1) + r_j^(b_ji2) (1 + r_i + r_i^2 + ...)
     = r_j^(b_ji1) + r_j^(b_ji2)/q_i.

And for the ℓ-th approximation

V_ij = r_j^(b_ji1) + r_j^(b_ji2) + ... + r_j^(b_ji,ℓ-1) + r_j^(b_jiℓ)/q_i.



Thus we have established a series of successive approximations for V_i(P), depending on
the value of ℓ up to which b_ijk is given its exact integer value. Since

(4)  V(P) = Σ_i p_i V_i(P),

we can thus calculate V(P) to any desired accuracy.

In an examination of problems of various sizes, it was found that ℓ = 10 gave very good
results indeed — typically resulting in a V(P) accurate to within ±10^-4 of its true value. If
ℓ = 20 is used, then understandably V(P) is much closer to its true value; it was always
observed to be within ±10~^ and frequently far closer.

Having found a precise method of calculating V(P), it is necessary next to find an
approach to evaluating P* and hence V(P*). Since P0 is always easy to calculate, and is always
close to P* if not actually equal to it, an obvious approach is to proceed as follows. Starting
from the vector P0, and having calculated V(P0), form a new vector P by increasing its first
component by, for example, 0.01, and decreasing each of its remaining components by an
amount proportional to the component's magnitude. If V(P) > V(P0), form a new vector by
modifying P as P0 was modified. Continue thus until a maximum is reached. If
V(P) < V(P0), search in the opposite direction. When a maximum is reached, at P' say (or
if no payoff greater than V(P0) is found in either direction), move from there (P' or P0) by
changing the vector's second component, and so on. When all directions have been examined,
the entire procedure can be repeated using progressively smaller increments until an acceptable
P* is found.
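The procedure just described is a coordinate ascent over the probability simplex. The sketch below is an illustrative reading of it (V(P) is evaluated here by a direct truncated expectation rather than by the paper's successive approximations, and the function names and tolerances are ours):

```python
def expected_time(p, r, tol=1e-10):
    """V(P) under the optimal 'greatest p_i q_i' searcher (truncated series)."""
    q = [1 - x for x in r]
    m, t, total, rem = list(p), 0, 0.0, 1.0
    while rem > tol:
        t += 1
        i = max(range(len(p)), key=lambda j: m[j] * q[j])
        d = m[i] * q[i]
        total += t * d
        rem -= d
        m[i] *= r[i]
    return total

def perturb(p, i, delta):
    """Raise component i by delta; scale the others down proportionally."""
    rest = 1.0 - p[i]
    return [p[i] + delta if j == i else pj * (rest - delta) / rest
            for j, pj in enumerate(p)]

def coordinate_search(p, r, steps=(0.01, 0.001, 0.0001)):
    """Evader's hill climb for P*: vary one component at a time,
    repeating with progressively smaller increments."""
    v = expected_time(p, r)
    for delta in steps:
        improved = True
        while improved:
            improved = False
            for i in range(len(p)):
                for d in (delta, -delta):
                    cand = perturb(p, i, d)
                    if min(cand) <= 0.0:       # P* is interior, stay feasible
                        continue
                    vc = expected_time(cand, r)
                    if vc > v + 1e-12:         # the evader maximizes E(T)
                        p, v, improved = cand, vc, True
    return p, v
```

For the two-region problem of Section 4 (r = (0.8, 0.512), starting from P0 with V(P0) = 6.51067), this climbs to a payoff within a few thousandths of the exact V(P*) = 6.53006.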

There are two reasons why this procedure invariably leads to an acceptable P*. First,
using an obvious extension to the argument in Section 2 of R & G, it may be shown that V(P)
is a continuous and concave function of P; hence, the problem of a local maximum does not
arise. Second, r_i < 1 for all i, and this means that P* is an interior maximum point; that is,
each of its components is positive.

In the illustrative examples of Sections 4 and 5 the procedure was continued to the extent
of using increments of 0.0001. When this sensitivity was combined with a value of ℓ = 30 in
determining V(P), one arrived at a value of V(P*) which, whenever checked against known
V(P*) (determinable as described in Sections 2(ii) or 2(iii)), was accurate to within 0.00004.

This procedure can prove to be inefficient if several regions have the same escape
probability. In such cases it seems advisable to increment all the components of P associated with
these particular regions together. This keeps such components equal, not an unreasonable
policy when one considers that at P* they will of course be equal. It may be advisable to observe
similar precautions also when several regions have approximately equal escape probabilities.

When the incidence of regions with the same escape probability is such that the search
problem may be regarded as the result of the joining together of a set of sub-problems, this
modification to our general procedure may be linked with a simplified method of calculating
V(P). This situation will be discussed in Section 2(ii).

(ii) Problems with identical blocks of regions

Consider a problem for which there is an integral number k of blocks of regions, where
the blocks (each of M regions, say) are identical in all respects. The location probability vector
P+ for such a problem may be expressed, with an obvious notation, in the form

P+ = (p_ij+ : i = 1, 2, ..., M; j = 1, 2, ..., k).

The expected payoff if the evader uses the strategy P+ and the searcher plays so as to minimize
the expected time to detection will be denoted by V_kM(P+). We shall use V_M(P) to designate
the similar function for the reduced problem in which the evader is restricted to one particular
block of M regions and plays the vector P = (p_i; i = 1, 2, ..., M). We shall say that P+
corresponds to P if

p_ij+ = p_i / k,   i = 1, 2, ..., M;  j = 1, 2, ..., k.

In such cases the calculation of V_kM(P+) is simplified by the following result.

THEOREM 1: If P+ corresponds to P then

V_kM(P+) = k V_M(P) - (k - 1)/2.






PROOF: The expected number of searches when the evader is hidden in the j-th block is
k V_M(P) - (k - j). The probability that the evader is actually hidden in the j-th block is
clearly 1/k. Hence

V_kM(P+) = Σ_{j=1}^k (1/k) [k V_M(P) - (k - j)] = k V_M(P) - (k - 1)/2.

COROLLARY: P*+ corresponds to P* and

V_kM(P*+) = k V_M(P*) - (k - 1)/2.

Here P*+ and P* are the evader's maximum strategies for a k-block problem and the related
single-block problem respectively. The proof is immediate from Theorem 1 and from the
observation that p_ij*+ must be independent of j for all i. It follows that the values of P* and
V(P*) for any M-region problem may be used to calculate similar quantities for problems
defined by adding identical blocks of M regions.
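Theorem 1 and its corollary can be checked against values quoted later in Section 4: the basic two-region problem there has V(P0) = 6.51067 and V(P*) = 6.53006, and its two-block, four-region version (example 1.2 of Table 1) has 12.521 and 12.560. A one-line sketch:

```python
def v_blocks(k, v_single):
    """Theorem 1: V_kM(P+) = k * V_M(P) - (k - 1)/2 for k identical blocks."""
    return k * v_single - (k - 1) / 2

assert abs(v_blocks(2, 6.51067) - 12.521) < 1e-3   # V(P0) of example 1.2
assert abs(v_blocks(2, 6.53006) - 12.560) < 1e-3   # V(P*) of example 1.2 (corollary)
```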

(iii) Problems with just two escape probabilities, one a power of the other

Consider a problem with regions (1, 1), (1, 2), ..., (1, A), (2, 1), (2, 2), ..., (2, B),
with

q_11 = q_12 = ... = q_1A = q1 = 1 - r1,
q_21 = q_22 = ... = q_2B = q2 = 1 - r2,

and with r1^n = r2, where n is an integer.

The components of P0 are

p_1i = q2 / (A q2 + B q1),   i = 1, 2, ..., A,
p_2i = q1 / (A q2 + B q1),   i = 1, 2, ..., B.

For a problem of this type, the infinite series which, as described in Section 2(i), arise in the
calculation of V(P0) may be summed explicitly, leading to the expression

V(P0) = {(1/2)A(A+1) q2 + (1/2)B(B+1) q1 + AB q2 + A^2 [r1 - n r2 + (n-1) r1 r2]/q1} / (A q2 + B q1)
        + (An + B) r2/q2.
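As a numerical check of this expression: with A = B = 1, r1 = 0.8 and n = 3 (so r2 = 0.512), it yields the value V(P0) = 6.51067 quoted for the basic two-region problem in Section 4. A sketch (the function name is ours):

```python
def v_p0(A, B, r1, n):
    """Closed-form V(P0) of Section 2(iii): A regions with escape
    probability r1 and B regions with r2 = r1**n."""
    r2 = r1 ** n
    q1, q2 = 1 - r1, 1 - r2
    num = (A * (A + 1) / 2 * q2 + B * (B + 1) / 2 * q1 + A * B * q2
           + A * A * (r1 - n * r2 + (n - 1) * r1 * r2) / q1)
    return num / (A * q2 + B * q1) + (A * n + B) * r2 / q2

assert abs(v_p0(1, 1, 0.8, 3) - 6.51067) < 1e-4
```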
Clearly the point P* must be such that

p*_1i = p*_1j,   1 ≤ i, j ≤ A,

and

p*_2i = p*_2j,   1 ≤ i, j ≤ B.

Closed expressions may also be obtained for V(P) for other vectors P with these properties.
Consider for example the vector P' with components

p'_1i = q2 / (A q2 + B q1 r1),   i = 1, 2, ..., A,
p'_2i = q1 r1 / (A q2 + B q1 r1),   i = 1, 2, ..., B.




Thus the strategy for the searcher which minimizes the expected time to detection,
corresponding to the evader's having chosen to play P', starts with searches of each of the first
A regions, and it follows from equation (2) that this transforms the vector P' into the vector
P0. We have

V(P') = P(evader is found in one of the first A searches) ×
        E(no. of searches required | evader is found in one of the first A searches) +
        P(evader is not found in one of the first A searches) ×
        E(no. of searches required | evader is not found in one of the first A searches)

      = A p'_11 q1 × (1/2)(A + 1) + (A p'_11 r1 + B p'_21) × (V(P0) + A).

Similar closed expressions may be obtained for V(P) for all those vectors P which have
the symmetries stated above, and which may be transformed into the vector P0. These are the
values of P which are of most interest, since P* must be one of them. This may be shown
along the lines described in R & G for the case when N = 2.

3. THE SEARCH PROBLEM IN CONTINUOUS TIME 

The N regions are now characterized by their detection rates λ_i (i = 1, 2, ..., N). As
before the evader may not move, but the location probability vector changes by Bayes' theorem
as the search progresses. Its value at time t is denoted by (p1(t), p2(t), ..., p_N(t)) and P0 is
the vector for which p_i λ_i is a constant.

In continuous time it is natural to allow the searcher to divide his effort at any given time
between the N regions. If u_i(t), a piecewise continuous† function of t, is the proportion of his
total effort allocated to region i at time t [u1(t) + u2(t) + ... + u_N(t) = 1], the probability of
the evader being detected in the time interval (t, t + δt) if he is in region i, and conditional on
detection not occurring before time t, is

(5)  λ_i u_i(t) δt + o(δt),

except at a point of discontinuity of the function u_i(t). The probability of detection in
(t, t + δt) if the location of the evader is given by the initial probability vector, and conditional
on detection not having occurred before time t, is therefore

(6)  Σ_{i=1}^N λ_i u_i(t) p_i(t) δt + o(δt);

and for the sake of convenience, we shall write this in the form ρ(t) δt + o(δt). The expressions
(5) and (6) lead respectively, and by elementary arguments, to alternative expressions for
F(t), the probability that detection does not take place before time t. We have

(7)  F(t) = Σ_{i=1}^N p_i(0) exp[-λ_i U_i(t)] = exp[-∫_0^t ρ(s) ds],

where

U_i(t) = ∫_0^t u_i(s) ds.

Also if T is the time to detection it is well known, and easily shown by integration by parts, that

(8)  E(T) = ∫_0^∞ F(t) dt.

†It is not difficult to modify the argument so that u_i(t) is required only to be a measurable function of t.




As for the discrete-time case, the evader's strategy is

P = (p1(0), p2(0), ..., p_N(0)).

The searcher's strategy is the vector function u which for any t ≥ 0 takes the value
(u1(t), u2(t), ..., u_N(t)). We shall denote the evader's maximum strategy by P* and proceed
now to prove:

THEOREM 2: For the continuous-time problem P* = P0.

PROOF: We note firstly that it follows from equations (7) and (8) that

E(T) = ∫_0^∞ Σ_{i=1}^N p_i(0) exp[-λ_i U_i(t)] dt.

It may be shown that this expression is minimized if and only if the searcher uses the
continuous-time version of the rule that he should search the region with the greatest current p_i q_i;
specifically,

u_i(t) > 0  ⇒  λ_i p_i(t) = max_j λ_j p_j(t),

except possibly at a set of times of Lebesgue measure zero. This result follows from a
variational argument similar to that used by Gittins [3] for another resource allocation problem.
Alternatively it may be established using the concept of uniform optimality discussed by Stone
[6].

Under such a rule there is clearly some t1 (< ∞) such that

t1 = inf{t : λ1 p1(t) = λ2 p2(t) = ... = λ_N p_N(t)},

and

(9)  p_i(t) = λ_i^(-1) / Σ_{k=1}^N λ_k^(-1),   i = 1, 2, ..., N;  t ≥ t1.

From Bayes' theorem we have

p_i(t) = p_i(0) exp[-λ_i U_i(t)] / Σ_{j=1}^N p_j(0) exp[-λ_j U_j(t)],   i = 1, 2, ..., N;  t ≥ 0,

so that

(10)  λ1 p1(0) exp[-λ1 U1(t)] = λ2 p2(0) exp[-λ2 U2(t)] = ...
                              = λ_N p_N(0) exp[-λ_N U_N(t)],   t ≥ t1.

Now differentiating equation (10) with respect to t, and dividing by (10), gives

u_i(t) = λ_i^(-1) / Σ_{j=1}^N λ_j^(-1),   i = 1, 2, ..., N;  t ≥ t1.
Thus from (6), 

1 



■r pit) = 



N 






. t ^ t 



i 



660 SEARCH FOR AN INTELLIGENT EVADER 

A similar argument shows that 



"« p«) > 



N 



A 



EV 



-1 



t < t. 



From equations (7), (8), (11), (12) it follows that if the searcher uses a continuous time ver- 
sion of the principle that in order to minimize the expected searching time, he should search 
that region with the greatest current /j,^,, then 

E{T) ^ £\-', 

7 = 1 

with equality if and only if fj =0. Now /i = if and only if p,(0)X, is a constant; that is to say 
if and only if P = Pg. Thus the evader's maximum strategy is P^ and the theorem is proved. 
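The equality case of the theorem can be illustrated numerically. With P = P0 and the stationary allocation u_i = λ_i^(-1)/Σ_j λ_j^(-1), equation (7) gives F(t) = exp(-t/Σ_j λ_j^(-1)), so that (8) integrates to E(T) = Σ_j λ_j^(-1). A sketch with illustrative rates:

```python
import math

def survival(p0, lam, u, t):
    """F(t) of eq. (7) for a constant allocation u, so U_i(t) = u_i * t."""
    return sum(p * math.exp(-l * ui * t) for p, l, ui in zip(p0, lam, u))

def expected_T(p0, lam, u, dt=0.01, tmax=100.0):
    """E(T) of eq. (8), by the trapezoidal rule."""
    n = int(tmax / dt)
    return dt * sum(0.5 * (survival(p0, lam, u, k * dt)
                           + survival(p0, lam, u, (k + 1) * dt))
                    for k in range(n))

lam = [0.5, 1.0, 2.0]
L = sum(1 / l for l in lam)        # sum of lambda_j^{-1} = 3.5
p0 = [(1 / l) / L for l in lam]    # P0: p_i(0) * lambda_i constant
u = list(p0)                       # u_i = lambda_i^{-1} / sum_j lambda_j^{-1}
assert abs(expected_T(p0, lam, u) - L) < 1e-2
```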

4. STRATEGIES FOR THE EVADER

The Two-Region Problem and Related, Larger Problems

From an examination of even a few N-region problems, one very soon concludes that
several of their characteristics depend on the value of N. Any description is therefore perhaps
best undertaken in terms of increasing N. And we shall begin with the simple two-region
problem r1 = 0.8, r2 = 0.512. (This problem has, as it happens, been chosen such that

r1^3 = r2,

but, as we shall see later, this has not in any way affected the generality of the conclusions.)

For this particular problem there are no fewer than three ways of finding V(P0), V(P*)
and P*. (P0 of course is always simply determined from p_i q_i being a constant.) First we note
that the method of Section 2(iii) may conveniently be followed. Second, the general approach
of Section 2(i) can be used, although usually we cannot expect to arrive at an exact V(P0)
(though in this particular case, we do, fortuitously) nor an exact P* and V(P*). Third, as in
very many two-region problems (see R & G), a dynamic programming method, first
described in Norris [4], is entirely feasible.

So for this problem, we have

V(P0) = 6.51067,   V(P*) = 6.53006,   V(P*)/V(P0) = 1.00298.

If in our examination of the effects of increasing N we see this as a starting point, then
clearly we can look at a whole range of four-region problems, all related to this basic two-region
problem in the sense that the largest and the smallest of the four escape probabilities are 0.8
and 0.512. It is often convenient to classify N-region problems in terms of the largest and
smallest escape probabilities; we shall refer to these as r_l and r_s respectively.






Four of these problems are listed in Table 1.

TABLE 1.

EXAMPLE    r1     r2      r3      r4       V(P0)     V(P*)     V(P*)/V(P0)
1.1        0.8    0.8     0.8     0.512    15.501    15.526    1.00160
1.2        0.8    0.8     0.512   0.512    12.521    12.560    1.00310
1.3        0.8    0.512   0.512   0.512     9.574     9.609    1.00362
1.4        0.8    0.7     0.6     0.512    11.312    11.327    1.00127



Three features are immediately evident from the table:

(i) The largest V(P*)/V(P0) exceeds the corresponding ratio in the two-region problem.

(ii) This occurs in a problem where only the largest and the smallest escape probabilities
are present — example 1.3 (problems where r2, r3 are distributed between r_l and r_s
have V(P*)/V(P0) ratios which are very much less — see for example 1.4).

(iii) Example 1.2 consists of two blocks of our original two-region problem. For this
example V(P*) and V(P0) — as well as of course P* — could have been derived from
the corresponding characteristics of the two-region problem using Theorem 1.

These features continue to manifest themselves as N increases; to eight for instance, as in
Table 2. (Unlike Table 1, this table does not contain all the frequencies of r_l and r_s; however
the example for which V(P*)/V(P0) takes its maximum value has been included.)

TABLE 2.

EXAMPLE    r1     r2     r3     r4     r5     r6     r7     r8       V(P0)     V(P*)     V(P*)/V(P0)
2.1        0.8    0.8    0.8    0.8    0.8    0.8    0.8    0.512    33.498    33.525    1.00081
2.2        0.8    0.8    0.8    0.8    0.8    0.8    0.512  0.512    30.503    30.553    1.00163
2.3        0.8    0.8    0.8    0.8    0.512  0.512  0.512  0.512    24.543    24.620    1.00316
2.4        0.8    0.8    0.512  0.512  0.512  0.512  0.512  0.512    18.649    18.718    1.00372
2.5        0.8    0.75   0.7    0.65   0.65   0.6    0.55   0.512    21.188    21.198    1.00047



Specifically, the largest V(P*)/V(P0) ratio continues to exceed the corresponding ratio in
the two-region problem; also it has increased slightly. Moreover, it occurs as previously in a
problem where the only escape probabilities present are either r_l or r_s — example 2.4 of Table
2. Once again we find some eight-region problems corresponding to four-region problems, or
even, in one case, to our basic two-region problem (examples 2.4, 2.3, 2.2 correspond to 1.3, 1.2, 1.1
respectively; example 2.3 is also composed of four blocks of the two-region problem).




If we look once again at example 2.4 we will see that it consists of two identical blocks
each of four regions, where each block is that of example 1.3. If now we go on to consider the
12-region problem

(A)  r1 = r2 = r3 = 0.8,   r4 = r5 = ... = r12 = 0.512,

then using Theorem 1 (noting that this problem consists of three identical blocks) we can
determine V(P*)/V(P0) for it.

Significantly, as the number of blocks increases still further this ratio will certainly
increase, but will never exceed 1.00382. Likewise from our basic two-region problem we can
infer that for all problems of N regions, where half of the regions have escape probability
0.8, and the other half 0.512, V(P*)/V(P0) is bounded above by 1.00322.
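Both bounds follow from Theorem 1: for k identical blocks the ratio is (k V(P*) - (k-1)/2)/(k V(P0) - (k-1)/2), which increases with k towards the supremum (V(P*) - 1/2)/(V(P0) - 1/2). A quick check with the values from Tables 1 and 2:

```python
def ratio(k, v_star, v0):
    """V(P*)/V(P0) for k identical blocks, via Theorem 1."""
    return (k * v_star - (k - 1) / 2) / (k * v0 - (k - 1) / 2)

# half 0.8 / half 0.512: blocks of the basic two-region problem
v0, v_star = 6.51067, 6.53006
rs = [ratio(k, v_star, v0) for k in range(1, 50)]
assert all(a < b for a, b in zip(rs, rs[1:]))   # the ratio increases with k
sup = (v_star - 0.5) / (v0 - 0.5)
assert all(r < sup for r in rs)
assert abs(sup - 1.00322) < 1e-4                # the bound quoted above

# blocks of example 1.3 (problem (A) and beyond)
assert abs((9.609 - 0.5) / (9.574 - 0.5) - 1.00382) < 1e-4
```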

From Tables 1 and 2 we note that those examples with the largest ratios V(P*)/V(P0)
both have three times as many regions with escape probability 0.512 as with 0.8 (examples 1.3,
2.4). This seems to be fortuitous. For in the 12-region problem the largest V(P*)/V(P0) ratio
occurs in the example

(B)  r1 = r2 = r3 = r4 = 0.8,   r5 = ... = r12 = 0.512,

although its value (1.00378) exceeds only slightly that of example (A).

The N-Region Problem — General Conclusions

Were we to have started with any other pair of escape probabilities as our basic two-region
example, and proceeded to examine associated problems up to 12 regions, then almost certainly
we would have observed features similar to (i), (ii) and (iii) above. In the course of this study
such a procedure was carried out about a dozen times on a set of problems intended to give as
representative a picture as possible. In each case this observation was found to apply, and, as
might be expected, such an analysis enables one to be more explicit in one's conclusions.

Specifically, the largest [V(P*) - V(P0)]/V(P0) increases with increasing N. It exceeds
that of the two-region problem, but is generally less than twice as large. Exceptions do exist,
although they seem to be confined to problems where one (and sometimes, though rarely, two)
escape probabilities are large (>0.85 say) and all the rest are by comparison quite small (about
0.1 or less); for example, in the 12-region problem

r1 = 0.95,   r2 = ... = r12 = 0.1285,

V(P*) is about 4½% greater than V(P0), whereas in the related two-region problem

r1 = 0.95,   r2 = 0.1285,

V(P*) is about 1% greater than V(P0).




In two-region problems where V(P*) and V(P0) are equal, one finds that in larger, related
problems the same generally applies. Although this was not found to be invariably so, such
differences as were observed were always of the order of 0.2% or less.

We are therefore able to say something about the ratio of V(P*) to V(P0) in N-region
problems (where N takes values up to 12 at least) provided we possess certain information
about the related two-region problem. The sort of information required is shown in Figure 1.
We can see there that the two-region problem

r1 = 0.8,   r2 = 0.2

possesses a V(P*)/V(P0) ratio of about 1.015. What can we say then about the 12-region
problem

r1 = 0.8 > r2 > ... > r11 > r12 = 0.2 ?

First, if r2, ..., r11 are distributed at all within this interval, then the ratio of V(P*) to V(P0)
is almost certain to be very much closer to one. Second, all such 12-region problems — even
those consisting solely of the two extreme escape probabilities r_l and r_s — will have a V(P*)
within 3% of V(P0); as indicated above, such exceptions to this conclusion as do exist tend to
be well defined, and this is certainly not one of them.

In any class of problems of the same size N, and characterized by the same r_l and r_s, the
problem with the largest proportional difference between V(P*) and V(P0) will be one of those
N - 1 problems whose escape probabilities are r_l and r_s only; for instance, example 1.3 of Table
1. What justification is there for proposing this conjecture? First, considerations of symmetry
assure us that for any N-region problem with all regions having the same escape probability, P0
and P* are the same. (One could call this a perfectly balanced problem.) Second, where
several escape probabilities are involved and where r_l and r_s are respectively the largest and the
smallest of them, it can be seen that a "less balanced" problem can be created (i.e., one moves
further away than one already was from the "all escape probabilities equal" situation) by moving
some or all of those not equal to r_l or r_s to one extreme, with the remainder being moved to
the other extreme. This conjecture has been examined in a number of cases, and has been
found always to be correct.

Special Relationships Existing Between Escape Probabilities 

All of the conclusions discussed so far were described using as a basis examples where the 
largest and smallest escape probabilities (r_1 and r_N) were such that the ratio of the logarithms of 
r_1 and r_N is an integer, as is the case in Table 1 for instance. Inevitably this inclines one to ask 
whether such a relationship has affected our conclusions in any way at all. 

This question has been considered quite extensively, but no evidence has been found to 
suggest that such relationships have any influence. The behaviour of V(P*)/V(P_0) as a func-
tion of the escape probabilities has been explored where no integer relationship exists between 
these probabilities. The function's general characteristics were found to be indistinguishable 
from those observed where integer relationships do exist. 



[Figure 1. The two-region problem. Contours show the ratio V(P*)/V(P_0).] 



5. STRATEGIES FOR THE SEARCHER 



We have confined our attention so far to an analysis of what the evader should do in par-
ticular situations. We saw that very often he can be satisfied with merely calculating P_0, know-
ing that V(P_0) is so close to V(P*) as to be adequate for his purposes. This suggests that the 
searcher might apply one of the good strategies vis-a-vis P_0. While this would often be a satis-
factory procedure, such a strategy generally turns out to be less adequate for the searcher than 
P_0 is for the evader. However, the searcher may always proceed via the calculations described 
in Section 2(i). 






In making these calculations he will of course determine an arbitrarily precise approxima-
tion to P* (which for convenience we shall refer to as P+). However, he will also have deter-
mined the components V_i(P+) from which, using equation (4), V(P+) is derived: viz. 

V(P+) = Σ_i p_i+ V_i(P+), 

where p_i+ denotes the i-th component of P+. A typical set of components for an eight-region 
problem is shown in Table 3. 

TABLE 3. 

  i        |    1    |    2    |    3    |    4    |    5    |    6    |    7    |    8 
  r_i      |   0.8   |  0.75   |   0.7   |  0.65   |  0.65   |   0.6   |  0.55   |  0.512 
  V_i(P+)  | 21.2907 | 21.031  | 21.1195 | 21.7895 | 20.7895 | 21.1753 | 21.1265 | 21.2774 



V_i(P+) is defined in Section 2(i) as the expected payoff assuming that the evader is actually 
hiding in region i; it is shown as a function of P+ merely to signify that it corresponds to a 
good strategy on the searcher's part vis-a-vis an evader strategy P+. This search strategy hap-
pens to be pure, for V_i(P+) was calculated on that basis. Significantly, at P* itself there will be 
several of these good strategies. The searcher's minimax strategy is one which ensures that the 
expected payoff is not greater than the value of the game, irrespective of what strategy the 
evader has played. It is a mixed strategy obtained by randomizing over those strategies which 
are good against P*. The randomization yields a set of V_i(P*) each equal to the value of the 
game V(P*). 

For the example of Table 3, V(P+) is 21.1985. It is interesting to note by how little any 
of the V_i(P+) differ from V(P+). In particular, the largest of them (21.7895) is only 3% greater 
than V(P+), which by the definition of P* is less than or equal to V(P*). This means that if 
the searcher plays the good strategy vis-a-vis P+, then he limits the evader to a maximum 
expected payoff of 21.7895, whatever strategy the latter happens to have chosen. 
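The guarantee just described can be read straight off Table 3. A minimal numerical sketch (the list below is simply the V_i(P+) row of Table 3):

```python
# Components V_i(P+) from Table 3 (eight-region example) and the value V(P+).
v = [21.2907, 21.031, 21.1195, 21.7895, 20.7895, 21.1753, 21.1265, 21.2774]
value_p_plus = 21.1985

# Playing the good pure strategy vis-a-vis P+ caps the evader's expected
# payoff at the largest component, whichever region he actually hides in.
guarantee = max(v)
excess = guarantee / value_p_plus - 1.0
print(guarantee)               # 21.7895
print(round(100 * excess, 1))  # 2.8, i.e., "only 3% greater"
```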

The situation we have just described, where the largest V_i(P+) is only a few percent 
greater than V(P+), is quite common, and seems invariably to be the rule where the escape pro-
babilities are well spread out between r_1 and r_N, as in Table 3. (This is in contrast to the some-
what clustered form of examples 2.3 and 2.4 of Table 2.) 

Should this situation not apply, then the searcher might be advised to consider points in 
the vicinity of P+ with a view to finding one where it does. Significantly also, having calculated 
the V_i(P) for several values of P, there is always the possibility that by randomizing over some 
of the associated pure strategies, he can effectively decrease still further the variation with 
respect to i. The reason for this is that if the searcher plays the good strategy associated with 
the vector P_j with probability π_j (j = 1, 2, ..., M), then the expected time to detection, given 
that the evader is in region i, is 

Σ_{j=1}^{M} π_j V_i(P_j). 

Randomizing over different good strategies associated with the same vector P has a similar 
effect. 

We can see this by referring once again to Table 3. The pure strategy corresponding to 
P+ searches region 4 prior to region 5 even though r_4 = r_5; this accounts for the difference of 
one between V_4(P+) and V_5(P+). But we could equally well have chosen to search region 5 
before region 4. By selecting either pure strategy with probability 0.5, the searcher can 
effectively modify Table 3 so that 

V_4(P+) = V_5(P+) = 21.2895, 

thereby limiting the evader to a maximum possible payoff (of 21.2907 if the latter is actually 
hiding in region 1) within 0.4% of V(P+), and of course even closer to V(P*). 
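This averaging can be sketched with the Table 3 figures; the pure strategy that searches region 5 before region 4 simply exchanges the fourth and fifth components:

```python
# V_i(P+) from Table 3.  Searching region 5 before region 4 (legitimate, since
# r_4 = r_5) simply exchanges the fourth and fifth components.
v_45 = [21.2907, 21.031, 21.1195, 21.7895, 20.7895, 21.1753, 21.1265, 21.2774]
v_54 = list(v_45)
v_54[3], v_54[4] = v_54[4], v_54[3]

# Playing each pure strategy with probability 0.5 averages the components.
mixed = [0.5 * x + 0.5 * y for x, y in zip(v_45, v_54)]
print(round(mixed[3], 4), round(mixed[4], 4))  # 21.2895 21.2895
print(max(mixed))                              # 21.2907, the new smaller guarantee
```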

The procedure outlined in Section 2(i) can be divided into two parts. The first described 
how one could find the payoff V(P) corresponding to any evader strategy P; the second sug-
gested how, by beginning with P_0 and iterating, one could converge towards P*. In discussing 
possible policies for the searcher, we have so far been assuming that he actually follows this 
procedure to the extent of determining a P+ near to P*, and then examines the values of V_i(P) 
at P+ and in its vicinity. However, a large number of iterations is often needed to reach P+. 
This immediately suggests a simplified approach open to the searcher. It is as follows: 

Start with P_0 and, using the first part of the procedure of Section 2(i), find V(P_0). Next 
take a sample of vectors (N, say) evenly distributed about P_0, and for each of these find V(P). 
From this sample select a P for which the components V_i(P) vary as little as possible. We 
have 

V(P) <= V(P*) <= max_i V_i(P). 

Clearly, if max_i V_i(P) - V(P) is small, the searcher may feel justified in playing a good strategy 
vis-a-vis P. 

Otherwise, of course, he can proceed to the evaluation of P+, and then derive his strategy 
as described above. 

REFERENCES 

[1] Black, W.L., "Discrete Sequential Search," Information and Control 8, 159-162 (1965). 
[2] Bram, J., "A 2-Player N-Region Search Game," IRM-31, Operations Evaluation Group, 
    Center for Naval Analyses, Washington (Jan. 1963). 
[3] Gittins, J.C., "Optimal Resource Allocation in Chemical Research," Advances in Applied 
    Probability 1, 238-270 (1969). 
[4] Norris, R.C., "Studies in Search for a Conscious Evader," MIT Lincoln Laboratory Techni-
    cal Report 279 (1962). 
[5] Roberts, D.M. and J.C. Gittins, "The Search for an Intelligent Evader: Strategies for 
    Searcher and Evader in the Two-Region Problem," Naval Research Logistics Quarterly 
    25, No. 1 (Mar. 1978). 
[6] Stone, L.D., Theory of Optimal Search (Academic Press, New York and London, 1975). 



THE QUEUEING SYSTEM M^X/G/1 AND ITS RAMIFICATIONS 

M. L. Chaudhry 

Royal Military College of Canada 
Kingston, Ontario, Canada 

ABSTRACT 

This paper deals with the bulk arrival queueing system M^X/G/1 and its 
ramifications. In the system M^X/G/1, customers arrive in groups of size X (a 
random variable) by a Poisson process, the service time distribution is general, 
and there is a single server. Although some results for this queueing system 
have appeared in various books, no unified account of these, as is being 
presented here, appears to have been reported so far. The chief objectives of 
the paper are (i) to unify by an elegant procedure the relationships between the 
p.g.f.'s 

P(z) = Σ_{n=0}^∞ P_n z^n  and  P+(z) = Σ_{n=0}^∞ P_n+ z^n, 

where P_n and P_n+ are the limiting probabilities of the queue length being n at ran-
dom and departure epochs respectively, (ii) to correct an error in the paper by 
Krakowski and generalize his results, and (iii) to discuss some other interesting 
cases of the system M^X/G/1 and its special cases. 

INTRODUCTION 

Several authors have discussed one aspect or another of the queueing system M^X/G/1, in which 
groups of random size X, following a Poisson process, join the system and are served individu-
ally by a single server whose service time distribution is general. For example, Gaver [6], using renewal 
theoretic arguments, discusses, among other things, P(z), the probability generating function 
(p.g.f.) of the limiting distribution of the number in the system at a random point in time; Sah-
bazov [14], using the imbedded Markov chain technique, discusses P+(z), the p.g.f. of the 
number in the system at a departure epoch, and the waiting time distribution for a random cus-
tomer of an arrival group. The waiting time distribution discussed by Sahbazov [14] and some 
other authors seems to have been incorrectly reported; the correct formulation has 
recently been given by Burke [1]. It may be pointed out here that it is possible to derive P+(z) 
either from the paper of Harris [9], some results of which are reported in Section 5.1.10 of 
Gross and Harris [7], or from the works of some other authors. 

In queueing problems the main interest, in general, centers around 
getting P_n. However, to simplify the analysis, Kendall, in his two important papers, proposed 
the use of imbedded Markov chains, and researchers then got P_n+ or other such probabilities 
by appropriately defining the regeneration points. Later, other researchers, in order to get 
P_n, first got P_n+ and then related P_n+ to P_n. However, one can easily get P_n directly in many sin-
gle server queueing systems with bulk arrival or bulk service or both, and if need be P_n+ from 
P_n. It is towards this purpose that this paper is chiefly addressed. To do this, we consider the 
queueing system M^X/G/1. The procedure has been successfully employed by Chaudhry and 
Templeton [2] in getting similar results for the bulk service queueing system M/G^B/1 wherein 
the server has a maximum capacity B. 

Gross and Harris [7] relate the probabilities P_n to different types of imbedded Markov 
chain probabilities π_n using the semi-Markov approach. Their probabilities π_n are different 
from the probabilities P_n+ (defined more precisely later) in that the π_n are considered at depar-
ture epochs and at arrival epochs when the system is empty, whereas the P_n+ are considered only at depar-
ture epochs. Thus, clearly, the set of epochs of the imbedded chain considered in this paper is a 
subset of the set of epochs of the imbedded chain considered by Gross and Harris [7]. 
Although the probabilities P_n are given in Gross and Harris, the technique of connecting the P_n 
and P_n+ is new and elegant in that one does not have to first obtain π_n by the imbedded chain 
as defined in Gross and Harris and then use the theory of semi-Markov processes or renewal 
theoretic arguments to get P_n, which has been the normal practice so far. The method discussed here is new in 
that we reverse the procedure of first getting P_n+ (or π_n) and then P_n; i.e., we first get P(z), 
and then the relation between the p.g.f.'s of P_n and P_n+ follows immediately, and consequently 
the relations between various moments of the underlying distributions. Such relations between 
the moments neither appear to have been explicitly discussed in the literature, nor would they 
be easily obtainable even if one tried to use the relation given in Gross and Harris [7], the 
exception to this statement being a recent result reported in Krakowski [11], the details of 
which are given in the next paragraph. It should, perhaps, be emphasized here that though the 
technique of the supplementary variable is standard, its full impact is not yet known, as will be 
revealed through the results presented in this paper. Besides, the technique is 
more powerful than the other techniques discussed so far, because by using the procedure dis-
cussed in this paper one can, as mentioned earlier, even get similar results for several other sin-
gle server queueing systems with bulk service, or possibly bulk arrival and bulk service, in which 
the service time distribution is general. Obtaining results similar to the 
ones presented here for the latter type of queueing systems through other techniques would be much more 
difficult, if not impossible. 

In a recent paper Krakowski [11] finds the average queue size at three epochs of time 
(random, just before arrival, and just after departure) for several queueing systems. However, 
for the system M^X/G/1 he obtains, using intuitive arguments, the above averages only at an 
arrival epoch or at a random epoch (see the scholion to his theorem B, or Section 4). Unfor-
tunately, his result (S.23) has been incorrectly reported. Krakowski's [11] equation (S.23) 
would be correct if he were considering groups of constant size, but that does not appear to be the case. 
Using mathematically sounder arguments, we later give the correct expression for Krakowski's 
result (S.23). In fact, by our method one can find not only the relations between averages, but 
also relations between higher order moments of the underlying distributions. 

It may be appropriate here to mention the other related works which have been discussed 
in the literature. Foster and Perera [5], using renewal theoretic arguments, have discussed rela-
tions between the steady-state probability generating functions (p.g.f.'s) of queue size considered at 
the above three epochs of time for the system GI^r/M/1, wherein customers following a 
recurrent process arrive in batches of fixed size r and are served by a single server whose ser-
vice time density is b(x) such that 

b(x) = μ e^{-μx},  x > 0,  μ > 0. 

Foster [4] also establishes heuristically some relations for the more general system GI^r/G/1. 
However, no such relations appear to have been systematically reported for the system M^X/G/1, in which 
the size of the arrival groups is random, an assumption that is better in many situations 
than the one in which the size of the arrival groups is constant. 

In this paper we carry out an analysis similar to the one carried out by Chaudhry and 
Templeton [2], for the queueing system M^X/G/1 wherein groups of random size X, following a 
Poisson process, join the system and are served individually by a single server whose service 
time distribution is general. In this way we can, in principle, not only obtain relations between 
all the moments of queue size (for the queueing system M^X/G/1) at the three epochs of time 
under consideration, but also discuss some other interesting properties of M^X/G/1 and unify 
the results of many authors. As Foster did, so shall we use the term 'queue size' in all three 
cases under discussion. 

The symbol E_k' will indicate a modified Erlangian distribution with p.d.f. 

Σ_{r=1}^{k} c_r {(μt)^{r-1}/(r - 1)!} μ exp(-μt). 

E_k, which can be obtained from E_k' by putting c_r = δ_{rk}, where δ_{rk} is the Kronecker symbol, will 
then be the usual Erlang (gamma) distribution, which is the convolution of k exponential distri-
butions each with mean 1/μ. 
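As a quick check on this definition, a short sketch (the weights c_r and the value of μ below are arbitrary illustrative choices): the mixture density integrates to one, and with c_r = δ_{r3} it reduces to the Erlang-3 density with mean 3/μ:

```python
import math

def modified_erlang_pdf(t, mu, c):
    """Mixture density: sum_r c[r-1] * mu*(mu*t)**(r-1)/(r-1)! * exp(-mu*t)."""
    return sum(cr * mu * (mu * t) ** (r - 1) / math.factorial(r - 1) * math.exp(-mu * t)
               for r, cr in enumerate(c, start=1))

def integrate(f, a, b, n=20000):
    """Composite Simpson's rule (n even)."""
    h = (b - a) / n
    s = f(a) + f(b) + sum((4 if i % 2 else 2) * f(a + i * h) for i in range(1, n))
    return s * h / 3

mu, c = 2.0, [0.2, 0.5, 0.3]   # illustrative weights summing to 1
total = integrate(lambda t: modified_erlang_pdf(t, mu, c), 0.0, 40.0)
print(round(total, 6))          # 1.0: the mixture is a proper density

# With c_r = delta_{r3} the mixture is the ordinary Erlang-3 density.
erlang3 = [0.0, 0.0, 1.0]
mean = integrate(lambda t: t * modified_erlang_pdf(t, mu, erlang3), 0.0, 40.0)
print(round(mean, 6))           # 1.5, i.e., k/mu = 3/2
```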



THE SYSTEM M^X/G/1 

P(z) 

Let N(t) be a random variable (r.v.) representing queue size at time t. Let groups of cus-
tomers arrive at epochs σ'_0, σ'_1, ..., σ'_n, ..., with their size being a r.v. X such that 

P(X = x) = a_x,  Σ_{x=1}^∞ a_x = 1  and  a = Σ_{x=1}^∞ x a_x < ∞. 

The arrival epochs form a Poisson process with mean interarrival time 1/λ. The service times of 
individual customers are independent, identically distributed r.v.'s with common density b(v) 
such that 1/μ = ∫_0^∞ v b(v) dv < ∞. Let σ_1, σ_2, ..., σ_n, ... be the epochs of departures of 
customers from the system. 

Let us now define the following probabilities: 

(i) P_j = lim_{t→∞} P(N(t) = j). 

This means P_j is the limiting probability (as t → ∞) of j in the system at a random 
epoch of time. 

(ii) P_j- = lim_{n→∞} P(N(σ'_n - 0) = j). 

This means P_j- is the limiting probability (as n → ∞) of j in the system just before 
an arrival epoch. 

(iii) P_j+ = lim_{n→∞} P(N(σ_n + 0) = j). 

This means P_j+ is the limiting probability (as n → ∞) of j in the system just after a 
departure epoch. 




Assuming that the various limiting probabilities exist, which they do when ρ = λa/μ < 1, 
it will be established that 

(1)  P-(z) = P(z) = {a(1 - z)/[1 - A(z)]} P+(z), 

where 

P(z) = Σ_j P_j z^j, etc. 

are the p.g.f.'s. 

The first equation of (1) is easily established because of the randomness of arrivals. To 
prove the second equation requires a somewhat more rigorous argument. We first discuss P(z) and 
then its relation to P+(z). To get P(z), we introduce the following notation. Let 

(a) η(x) be the conditional service rate, so that the service time density and distribution 
functions are, respectively, given by 

b(x) = η(x) exp{- ∫_0^x η(t) dt}, 
B(x) = 1 - exp{- ∫_0^x η(t) dt}; 

(b) P_n(x, t) = lim_{Δx→0} P[N(t) = n, x < X(t) <= x + Δx]/Δx, 

where X(t) is the elapsed service time of the customer undergoing service at time t; 

(c) P_0(t) = P[N(t) = 0]. 

Now, to find the p.g.f. P(z) of N(t) in the limiting case, we proceed as in Cox [3]. Since 
the arguments and steps in deriving the steady-state equations are the same, we only give the 
differential equations in the limiting case as t → ∞, with the notation 

lim_{t→∞} P_n(x, t) = P_n(x),  lim_{t→∞} P_0(t) = P_0: 

(2)  0 = -λP_0 + ∫_0^∞ P_1(x) η(x) dx, 

(3)  dP_n(x)/dx = -(λ + η(x)) P_n(x) + λ Σ_{m=1}^{n-1} a_m P_{n-m}(x),  n >= 1, 

which are to be solved under the so-called boundary condition 

(4)  P_n(0) = ∫_0^∞ P_{n+1}(x) η(x) dx + λ a_n P_0,  n >= 1, 

and the normalizing condition 

(5)  P_0 + Σ_{n=1}^∞ ∫_0^∞ P_n(x) dx = 1. 

Define the p.g.f.'s 

(6)  P(z; x) = Σ_{n=1}^∞ P_n(x) z^n,  A(z) = Σ_{m=1}^∞ a_m z^m. 

Once again applying Cox's procedure to equations (2) and (3), and using (4) and 
(6), we get the following results: 

(7)  P(z; x) = P(z; 0)(1 - B(x)) exp[λ(A(z) - 1)x] 

and consequently, writing Q(z) = ∫_0^∞ P(z; x) dx, 

Q(z) = zP_0{b*(λ - λA(z)) - 1}/[z - b*(λ - λA(z))], 

where 

b*(α) = ∫_0^∞ e^{-αx} b(x) dx 

and 

P(z; 0) = λzP_0[A(z) - 1]/[z - b*(λ - λA(z))], 

which can be obtained by using (4) and (7). 

Finally, 

(8)  P(z) = Q(z) + P_0 = P_0(1 - z)/[1 - {z/b*(λ - λA(z))}], 

where 

P_0 = 1 - ρ. 

The result given in (8) for the case G = E_k' or E_k has been independently discussed 
by Gupta [8] and Restrepo [13], respectively. Gaver [6] obtained (8) by using renewal theoretic 
arguments. We could, as well, obtain (8) by identifying a group with a single customer, so that the 
group's service time distribution is just the total service time of the members constituting the 
group, and then using the results of the single server system M/G/1. However, as has been 
pointed out earlier, the present approach is different: it not only leads immediately to 
the result for P+(z), but also unifies all the results reported so far, besides correcting an error 
in Krakowski's [11] results. 

P+(z) 

To get P+(z), or to relate P(z) to P+(z), we first find the P_n+ and then the relation is easily 
established. We have 

P_n+ = D ∫_0^∞ P_{n+1}(x) η(x) dx,  n >= 0, 

where D is a normalizing constant. The p.g.f. P+(z) is then given by 

P+(z) = Σ_{n=0}^∞ P_n+ z^n 
      = (D/z) Σ_{n=0}^∞ ∫_0^∞ z^{n+1} P_{n+1}(x) η(x) dx 
      = (D/z) P(z; 0) ∫_0^∞ b(x) e^{-λ(1-A(z))x} dx 

(9)   = λDP_0[A(z) - 1]/[{z/b*(λ - λA(z))} - 1], 

where we have used the expression for P(z; 0). Using the normalizing condition P+(1) = 1 and the 
value of P(z), we get 

P+(z) = {[1 - A(z)]/[(1 - z)a]} P(z), 

as stated in (1). If a_x = δ_{x1}, then P+(z) = P(z), a result first established by Khintchine [10] 
and later by other authors. It can be shown that the result (9) agrees with the one obtained by 
Sahbazov [14] using the imbedded Markov chain procedure. Now we wish to discuss other 
interesting features of the system under consideration. 




I. The system M^X/M/1. This may be obtained from M^X/G/1 by taking exponential service, so that 
η(x) = μ, and (8) gives (since b*(α) = μ/(μ + α)) 

(10)  P(z) = (1 - ρ)(1 - z)μ/[μ - z{μ + λ - λA(z)}], 

which is Luchak's [12] result for M/E_k'/1. It is thus interesting to see that the systems M^X/M/1 
and M/E_k'/1 are equivalent, not only when P(X = k) = 1 (see, e.g., the later part of Section 4.3.1 
of Gross and Harris [7]), but also when X is a r.v. The equivalence of the more general sys-
tems GI^r/M/1 and GI/E_r/1, wherein r is fixed, has been considered by several authors; see, for 
example, Foster [4]. It is possible, in principle, to show that GI^X/M/1 and GI/E_k'/1 are 
equivalent, but the analysis would be a bit more cumbersome. 
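Equation (10) is easy to evaluate numerically. In the sketch below, the batch-size law (geometric with parameter θ) and the parameter values are illustrative assumptions, not taken from the paper; the checks P(0) = 1 - ρ and relation (11), P_0 = aP_0+, follow from (8) and (1):

```python
# Equation (10): p.g.f. of the number in system for M^X/M/1, here with the
# illustrative geometric batch-size law P(X = k) = (1-theta)*theta**(k-1),
# for which A(z) = (1-theta)*z/(1-theta*z) and the mean group size is a = 1/(1-theta).
lam, mu, theta = 0.3, 2.0, 0.5
a = 1.0 / (1.0 - theta)
rho = lam * a / mu

def A(z):
    return (1.0 - theta) * z / (1.0 - theta * z)

def P(z):
    return (1.0 - rho) * (1.0 - z) * mu / (mu - z * (mu + lam - lam * A(z)))

print(round(P(0.0), 6))          # 0.7, i.e., P_0 = 1 - rho, as (8) requires
print(round(P(1.0 - 1e-8), 5))   # 1.0: P(z) -> 1 as z -> 1, as a p.g.f. must

# Relation (1) gives P+(z) = {(1 - A(z))/((1 - z)a)} P(z); at z = 0 this is (11).
P0_plus = (1.0 - A(0.0)) / a * P(0.0)
print(abs(P(0.0) - a * P0_plus) < 1e-12)   # True: P_0 = a * P_0+
```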

II. From the relation (1), one can easily see that 

(11)  P_0 = a P_0+. 

The result (11) is new and exhibits an interesting phenomenon. It states that, for a > 1, an observer is 
more likely to find the system empty than a departing customer is to leave it empty. Its accu-
racy can easily be checked, for when a = 1, (11) reduces to the known relation P_0 = P_0+ for 
the single server queueing system M/G/1. 

III. Another interesting case, which is connected with case II or equation (11), is the rela-
tion between the imbedded Markov chain probabilities P_0+ and π_0. From Gross and Harris [7] 
or Harris [9], one can find that 

(12)  π_0 = (1 - ρ)/(1 - ρ + a) = P_0/(P_0 + a) = P_0+/(P_0+ + 1), 

where the last expression has been obtained by using equation (11). 

IV. The various moments of queue size may be obtained from (1). In particular, if L-, L, L+ 
denote the expected queue sizes at the three epochs of time (just before arrival, random, and 
just after departure), then one can easily see from (1) that the following relations must be 
satisfied: 

(13)  L- = L = L+ - (σ_a^2 + a^2 - a)/(2a), 

where L may be obtained from (8) and is given by 

(14)  L = ρ + ρ^2(μ^2 σ_s^2 + 1)/[2(1 - ρ)] + ρ(σ_a^2 + a^2 - a)/[2(1 - ρ)a], 

where 

σ_s^2 = variance of the service time distribution, and 
σ_a^2 = variance of the group size distribution. 

One can see from (14) that the average number in the queue, L_q, is given by 

(15)  L_q = L - ρ = ρ^2(μ^2 σ_s^2 + 1)/[2(1 - ρ)] + ρ(σ_a^2 + a^2 - a)/[2(1 - ρ)a]. 

L_q is now the more general and correct form of Krakowski's [11] result (S.23), which is true 
only when the group size is constant rather than a random variable. Once L_q is known, one 
can obtain W_q from Little's formula, L_q = λa W_q. 



Equation (13) shows that an observer (for a > 1) is more likely to see a shorter expected 
queue size than a departing customer leaves behind, which is consistent with the remark made 
in case II. 
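Relation (14) can be cross-checked numerically against (10): for exponential service μ^2 σ_s^2 = 1, and the mean queue size L must equal P'(1). The geometric batch-size law and the parameter values below are illustrative assumptions, not from the paper:

```python
# Cross-check of (14) for M^X/M/1 (so mu^2 * sigma_s^2 = 1): the mean queue
# size L must equal P'(1), with P(z) given by (10).  Geometric batch sizes
# with parameter theta are an illustrative assumption.
lam, mu, theta = 0.3, 2.0, 0.5
a = 1.0 / (1.0 - theta)              # mean group size
var_a = theta / (1.0 - theta) ** 2   # variance of group size
rho = lam * a / mu

def P(z):                            # equation (10)
    Az = (1.0 - theta) * z / (1.0 - theta * z)
    return (1.0 - rho) * (1.0 - z) * mu / (mu - z * (mu + lam - lam * Az))

# Equation (14) with mu^2 * sigma_s^2 = 1 (exponential service):
L_formula = rho + rho**2 * 2 / (2 * (1 - rho)) \
    + rho * (var_a + a**2 - a) / (2 * (1 - rho) * a)

# Numerical derivative of P at z = 1 (central difference):
h = 1e-4
L_numeric = (P(1 + h) - P(1 - h)) / (2 * h)
print(round(L_formula, 6), round(L_numeric, 6))  # both about 0.857143
```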

V. If one is interested in the distribution of the number in the queue, it may be obtained from 
P(z). For if one defines, in the limiting case t → ∞, N_q as a r.v. for the number in the queue, 
so that 

N_q = N - 1 for N >= 1, and N_q = 0 for N = 0, 

then 

E[z^{N_q}] = P_0(z - 1)/[z - b*(λ - λA(z))]. 

This has an interesting interpretation. It shows that the p.g.f. of the number in the system at a 
random epoch, for the bulk arrival system M^X/G/1, is equal to the p.g.f. of the number in the 
queue times the p.g.f., b*(λ - λA(z)), of the number of customers that arrive during the service 
time of a customer. Such an interpretation for the system M/G/1, wherein arrivals are by 
singlets, is well known. 

VI. An interesting result which falls outside the preceding results is the expected busy period 
of the server. One way to find the expected busy period of the server is to first find the distri-
bution of the busy period, and then from it the expected value. However, its derivation by using 
an alternating renewal process is more elegant, and it is this approach that we adopt here. Since idle 
periods and busy periods generate an alternating renewal process, we have from the theory of 
renewal processes 

E(X)/E(Y) = ρ/(1 - ρ), 

where E(X) and E(Y) are the expected busy and idle periods respectively. Now, since in 
M^X/G/1, by the forgetfulness property of the exponential, E(Y) = 1/λ, we get 

E(X) = a/(μ - λa), 

which reduces to the well-known result for the queueing system M/G/1 if we take 

a_x = δ_{x1}. 

ACKNOWLEDGMENT 

The research for this paper was supported (in part) by the Defence Research Board of 
Canada, Grant Number 3610-603. The author is extremely grateful to a referee for pointing 
out the relation (12) and for a few other useful recommendations. 

REFERENCES 

[1] Burke, P.J., "Delays in Single-Server Queues with Batch Input," Operations Research 23, 
    830-833 (1975). 
[2] Chaudhry, M.L. and J.G.C. Templeton, "The Queueing System M/G^B/1 and its 
    Ramifications," under submission. 
[3] Cox, D.R., "The Analysis of Non-Markovian Stochastic Processes by the Inclusion of Sup-
    plementary Variables," Proceedings of the Cambridge Philosophical Society 51, 433-441 
    (1955). 
[4] Foster, F.G., "Batched Queueing Processes," Operations Research 12, 441-449 (1964). 
[5] Foster, F.G. and A.G.A.D. Perera, "Queues with Batch Arrivals II," Acta Mathematica 
    Academiae Scientiarum Hungaricae 16, 275-287 (1965). 
[6] Gaver, D.P., "Imbedded Markov Chain Analysis of a Waiting Line Process in Continuous 
    Time," Annals of Mathematical Statistics 30, 698-720 (1959). 
[7] Gross, D. and C.M. Harris, Fundamentals of Queueing Theory (John Wiley and Sons, 
    1974). 
[8] Gupta, S.K., "Queues with Batch Poisson Arrivals and a General Class of Service Time 
    Distributions," Journal of Industrial Engineering 15, 319-320 (1964). 
[9] Harris, C.M., "Some Results for Bulk-Arrival Queues with State-Dependent Service 
    Times," Management Science 16, 313-326 (1970). 
[10] Khintchine, A., "Mathematical Theory of a Stationary Queue," Matematiceskii Sbornik 
    39, 73-84 (Russian) (1932). 
[11] Krakowski, M., "Arrival and Departure Processes in Queues," Revue Francaise Automa-
    tique Informatique et Recherche Operationnelle V-1, 45-56 (1974). 
[12] Luchak, G., "The Continuous Time Solution of the Equations of the Single Channel 
    Queue with a General Class of Service Time Distributions by the Method of Generat-
    ing Functions," Journal of the Royal Statistical Society, Series B 20, 176-181 (1958). 
[13] Restrepo, R.A., "A Queue with Simultaneous Arrivals and Erlang Service Distribution," 
    Operations Research 13, 375-381 (1965). 
[14] Sahbazov, A.A., "A Problem of Service with Non-Ordinary Demand Flow," Soviet 
    Mathematics Doklady 3, 1000-1003 (1962). 






ON THE MOMENTS OF GAMMA ORDER STATISTICS 



P. C. Joshi 

Department of Mathematics 

Indian Institute of Technology 

Kanpur, India 

ABSTRACT 

A recurrence relation between the moments of order statistics from the 
gamma distribution having an integer parameter r is obtained. It is shown that 
if the negative moments of orders -(r - 1), ..., -1 of the smallest order 
statistic in random samples of size n are known, then one can obtain all the 
moments. Tables of negative moments for r = 2 (1) 5 are also given. 



1. INTRODUCTION 

Let X be a gamma random variable with probability density function 

(1)  f(x) = e^{-x} x^{r-1}/Γ(r),  x > 0, 

where r > 0. Let X_1, X_2, ..., X_n be a random sample from (1), and X_{1,n} <= X_{2,n} <= ... <= X_{n,n} 
be the corresponding order statistics. Denote the i-th moment of X_{k,n} by α_{k,n}^{(i)}. 

An expression for α_{k,n}^{(i)} is given by Gupta [4] when r is an integer, and by Krishnaiah and 
Rizvi [6] for a general value of r. Tables of moments for selected values of n, k, r and i are 
given by Breiter and Krishnaiah [2] and Gupta [4]. The gamma order statistics and their 
moments are of great use in the analysis of life testing data, especially for r = 1, when the 
gamma distribution reduces to the exponential distribution. Some applications of gamma order 
statistics are discussed in Gupta [4] and Young [8]. 

The moments of order statistics are known to satisfy some recurrence relations; for exam-
ple, see David ([3], pp. 36-38). In particular, 

(2)  α_{k,n}^{(i)} = Σ_{j=n-k+1}^{n} (-1)^{j-n+k-1} C(j-1, n-k) C(n, j) α_{1,j}^{(i)}, 

where C(n, j) = n!/{j!(n - j)!} denotes the binomial coefficient. 

Thus the moments of X_{k,n} can be obtained as a linear combination of the moments of the smal-
lest order statistics in random samples of sizes n - k + 1, ..., n. In this paper, we derive another 
type of recurrence relation when r is an integer. In fact, we show that higher order moments of 
X_{k,n} can be obtained from those of lower order. Recurrence relations of this type for 
specific distributions are given by Barnett [1] for the Cauchy distribution, by the author [5] for the 
exponential and truncated exponential distributions, and by Shah [7] for the logistic distribution. 
In addition we provide a table of negative moments α_{k,n}^{(i)} for r = 2 (1) 5 and i = -(r - 1), ..., -1. 






2. THE NEGATIVE MOMENTS 



For the gamma random variable X with density given by (1), the i-th moment 

E(X^i) = Γ(r + i)/Γ(r) 

exists for all i > -r. Consequently, the i-th moment of X_{k,n} also exists for i > -r (David [3], 
pp. 25-26). When r is a positive integer, then 

α_{k,n}^{(i)} = [n!/{(k - 1)!(n - k)!}] ∫_0^∞ x^i {F(x)}^{k-1} {1 - F(x)}^{n-k} f(x) dx, 

where F(x) is the cumulative distribution function of X, given by 

F(x) = 1 - e^{-x} Σ_{j=0}^{r-1} x^j/j!,  x > 0. 



Gupta [4] has shown that this can be written as 






α_{k,n}^{(i)} = [n!/{(k - 1)!(n - k)! Γ(r)}] Σ_{p=0}^{k-1} (-1)^p C(k-1, p) 
               Σ_{m=0}^{(r-1)(n-k+p)} a_m(r, n - k + p) Γ(r + i + m)/(n - k + p + 1)^{r+i+m}, 

where a_m(r, p) is the coefficient of t^m in the expansion of {Σ_{j=0}^{r-1} t^j/j!}^p. For r = 1 (1) 5, he has 
used this relation for tabulating the first four moments of X_{k,n} for 1 <= k <= n <= 10, and of 
X_{1,n} for n = 11 (1) 15. In Table 1, we extend his tables to negative moments α_{k,n}^{(i)} for r = 2 
(1) 5, i = -(r - 1), ..., -1 and 1 <= k <= n <= 10, and α_{1,n}^{(i)} for n = 11 (1) 25 and the same 
values of r and i. These were evaluated to eight significant digits and are correct to the five 
decimal places as tabulated. For n <= 10, they were also checked by using the identity 

Σ_{k=1}^{n} α_{k,n}^{(i)} = n Γ(r + i)/Γ(r). 



3. THE RECURRENCE RELATION 

In [5], the author has shown that for the exponential distribution (r = 1) 

α_{k,n}^{(i)} = α_{k-1,n-1}^{(i)} + (i/n) α_{k,n}^{(i-1)},  i = 1, 2, ...;  1 <= k <= n, 

where we follow the conventions 

(3)  α_{k,n}^{(0)} = 1,  1 <= k <= n, 

(4)  α_{0,n}^{(i)} = 0,  i = 1, 2, ...;  n = 0, 1, 2, .... 

This recurrence relation was then extended to the right truncated exponential distribution. We 
now generalize this result in another direction, to the gamma distribution, and show that for 
integral values of r 

(5)  α_{k,n}^{(i)} = α_{k-1,n-1}^{(i)} + (i Γ(r)/n) Σ_{t=0}^{r-1} α_{k,n}^{(i-r+t)}/t! 

for i = 1, 2, ..., 1 <= k <= n, where the conventions given at (3) and (4) are followed. To 
this end, let 



hix) = - 



/■- 1 
I 

7=0 



k-l 



e "x^lj 



7=0 



n-k + \ 



then 



h'{x) = 



1 


7-0 


~''xJlj\ 


k-2 


r-l 

I' 

7=0 


r^xVy! 


n- 


-k 






n 


r-\ 
7=0 


e-'^xJ/jl 


- (A:-l) 


t 



e-^x^-1 



(r-D! 



and 



„ (') _ „ (') _ 

(^k,n ^k-\.n-\ — 



n-\ 
k-\ 



r x'/?'(x) dx. 

•Jo 

Integrating by parts, by treating x' for differentiation and /?'(x) for integration, we have 

$$\alpha_{k,n}^{(i)} - \alpha_{k-1,n-1}^{(i)} = -\binom{n-1}{k-1} \int_0^\infty i\,x^{i-1}\, h(x)\, dx = i \binom{n-1}{k-1} \int_0^\infty x^{i-1} \left[1 - \sum_{j=0}^{r-1} e^{-x} x^j / j!\right]^{k-1} \left[\sum_{j=0}^{r-1} e^{-x} x^j / j!\right]^{n-k} \sum_{t=0}^{r-1} \frac{e^{-x} x^t}{t!}\; dx.$$



Taking the sum over $t$ outside the integral sign, multiplying and dividing by $\Gamma(r)$, and integrating term by term we get

$$\alpha_{k,n}^{(i)} - \alpha_{k-1,n-1}^{(i)} = \frac{i\,\Gamma(r)}{n} \sum_{t=0}^{r-1} \frac{\alpha_{k,n}^{(i-r+t)}}{t!},$$

which proves the result.
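The functions $h$ and $h'$ used in this derivation, as reconstructed here, can be sanity-checked numerically: $h'$ should agree with a centered finite difference of $h$. A sketch (the sample parameters are arbitrary):

```python
from math import exp, factorial

def G(x, r):
    """Gamma survival function 1 - F(x) for integer shape r."""
    return sum(exp(-x) * x ** j / factorial(j) for j in range(r))

def h(x, r, n, k):
    # h(x) = -[F(x)]^{k-1} [1 - F(x)]^{n-k+1}  with F = 1 - G
    return -((1.0 - G(x, r)) ** (k - 1)) * G(x, r) ** (n - k + 1)

def h_prime(x, r, n, k):
    # h'(x) = F^{k-2} G^{n-k} [nF - (k-1)] f(x), written with F = 1 - G
    f = exp(-x) * x ** (r - 1) / factorial(r - 1)   # gamma density
    return ((1.0 - G(x, r)) ** (k - 2) * G(x, r) ** (n - k)
            * (n * (1.0 - G(x, r)) - (k - 1)) * f)
```

At, e.g., $x = 1.3$, $r = 3$, $n = 6$, $k = 4$, the two quantities agree to well beyond plotting accuracy.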



Relation (5) expresses the $i$th order moment of $X_{k,n}$ in terms of the $i$th order moment of $X_{k-1,n-1}$ and lower order moments of $X_{k,n}$. In particular, it gives the mean of $X_{1,n}$ in terms of moments of orders $-(r-1), \ldots, -1$ of $X_{1,n}$, the second moment of $X_{1,n}$ in terms of moments of orders $-(r-2), \ldots, -1, 0, 1$ of $X_{1,n}$, etc. Taken together with relation (2), it shows that if the negative moments of orders $-(r-1), \ldots, -1$ of the smallest order statistic in samples of size $j \le n$ are known, then one can calculate all the moments $\alpha_{k,n}^{(i)}$ for $i = 1, 2, \ldots$ and $1 \le k \le n$.

It should be noted that only non-negative terms are added for the evaluation of $\alpha_{k,n}^{(i)}$ in equation (5). Consequently, the rounding errors are negligible for small values of r. This is illustrated in the following example.

EXAMPLE: r = 2 and k = 1. In this case equation (5) reduces to

(6) $\alpha_{1,n}^{(i)} = (i/n)\left(\alpha_{1,n}^{(i-2)} + \alpha_{1,n}^{(i-1)}\right), \quad i = 1, 2, \ldots.$

Thus for n = 10, say, we have from Table 1, $\alpha_{1,10}^{(-1)} = 3.66022$. Equation (6) then gives $\alpha_{1,10}^{(1)} = 0.46602$, $\alpha_{1,10}^{(2)} = 0.29320$, $\alpha_{1,10}^{(3)} = 0.22777$, $\alpha_{1,10}^{(4)} = 0.20839$, etc. These values agree perfectly with the values evaluated directly (see also Gupta [4]).
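In this r = 2, k = 1 case the recurrence is a two-line computation. A sketch (our illustrative helper, seeded only with the tabulated $\alpha_{1,10}^{(-1)}$ and the convention $\alpha^{(0)} = 1$):

```python
def moments_r2_k1(n, alpha_minus_1, i_max):
    """alpha^{(i)} for X_{1,n}, gamma shape r = 2, via recurrence (6)."""
    a = {-1: alpha_minus_1, 0: 1.0}              # convention (3): alpha^{(0)} = 1
    for i in range(1, i_max + 1):
        a[i] = (i / n) * (a[i - 2] + a[i - 1])   # only non-negative terms are added
    return a

a = moments_r2_k1(10, 3.66022, 4)
```

Rounding to five decimals recovers 0.46602, 0.29320, 0.22777 and 0.20839, the values quoted above.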



678 



P.C. JOSHI



For all values of n, r and k, the moments of order $i$, $1 \le i \le 4$, obtained by direct evaluation and by recurrence relation (5) agree up to eight significant digits, the digits up to which the calculations were performed and on which Table 1 is based.



TABLE 1. Table of Negative Moments $E(X_{k,n}^i)$ of Gamma Order Statistics for r = 2 (1) 5.

          r = 2          r = 3                      r = 4                               r = 5
 n   k    i=-1      i=-2     i=-1       i=-3     i=-2     i=-1       i=-4     i=-3     i=-2     i=-1
 1   1   1.00000   0.50000  0.50000    0.16667  0.16667  0.33333    0.04167  0.04167  0.08333  0.25000
 2   1   1.50000   0.87500  0.68750    0.31250  0.27083  0.43750    0.08073  0.07422  0.12891  0.31836
 2   2   0.50000   0.12500  0.31250    0.02083  0.06250  0.22917    0.00260  0.00911  0.03776  0.18164
 3   1   1.88889   1.20370  0.82099    0.44833  0.35566  0.50810    0.11832  0.10295  0.16424  0.36321
 3   2   0.72222   0.21759  0.42052    0.04084  0.10118  0.29629    0.00555  0.01676  0.05824  0.22866
 3   3   0.38889   0.07870  0.25849    0.01083  0.04316  0.19560    0.00113  0.00529  0.02752  0.15813
 4   1   2.21875   1.50415  0.92792    0.57746  0.42950  0.56292    0.15485  0.12928  0.19403  0.39733
 4   2   0.89931   0.30236  0.50020    0.06093  0.13415  0.34365    0.00872  0.02397  0.07488  0.26085
 4   3   0.54514   0.13282  0.34085    0.02074  0.06820  0.24894    0.00238  0.00954  0.04159  0.19646
 4   4   0.33681   0.06066  0.23103    0.00753  0.03482  0.17783    0.00071  0.00388  0.02283  0.14536
 5   1   2.51040   1.78463  1.01852    0.70158  0.49594  0.60831    0.19056  0.15387  0.22020  0.42517
 5   2   1.05215   0.38222  0.56550    0.08102  0.16375  0.38135    0.01202  0.03089  0.08935  0.28598
 5   3   0.67004   0.18258  0.40225    0.03081  0.08975  0.28710    0.00375  0.01360  0.05317  0.22314
 5   4   0.46187   0.09965  0.29992    0.01403  0.05383  0.22349    0.00147  0.00683  0.03387  0.17868
 5   5   0.30554   0.05092  0.21381    0.00590  0.03006  0.16641    0.00053  0.00314  0.02007  0.13703
 6   1   2.77469   2.04988  1.09789    0.82168  0.55693  0.64736    0.22559  0.17713  0.24377  0.44883
 6   2   1.18894   0.45840  0.62168    0.10103  0.19097  0.41309    0.01543  0.03757  0.10235  0.30684
 6   3   0.77856   0.22986  0.45312    0.04100  0.10931  0.31788    0.00521  0.01754  0.06337  0.24427
 6   4   0.56151   0.13531  0.35138    0.02061  0.07019  0.25633    0.00229  0.00965  0.04297  0.20202
 6   5   0.41205   0.08182  0.27419    0.01074  0.04566  0.20707    0.00106  0.00542  0.02932  0.16701
 6   6   0.28424   0.04474  0.20174    0.00493  0.02694  0.15828    0.00042  0.00269  0.01822  0.13103
 7   1   3.01814   2.30292  1.16897    0.93848  0.61369  0.68180    0.26003  0.19932  0.26535  0.46951
 7   2   1.31401   0.53162  0.67144    0.12093  0.21637  0.44071    0.01891  0.04403  0.11424  0.32479
 7   3   0.87628   0.27533  0.49728    0.05127  0.12748  0.34404    0.00674  0.02140  0.07262  0.26198
 7   4   0.64827   0.16924  0.39424    0.02730  0.08509  0.28299    0.00318  0.01240  0.05103  0.22065
 7   5   0.49645   0.10986  0.31923    0.01560  0.05901  0.23633    0.00163  0.00758  0.03693  0.18805
 7   6   0.37829   0.07060  0.25617    0.00880  0.04032  0.19537    0.00083  0.00455  0.02628  0.15859
 7   7   0.26856   0.04043  0.19266    0.00429  0.02471  0.15210    0.00035  0.00238  0.01688  0.12644
 8   1   3.24502   2.54586  1.23362    1.05244  0.66703  0.71273    0.29397  0.22060  0.28537  0.48792
 8   2   1.42998   0.60239  0.71637    0.14071  0.24031  0.46528    0.02244  0.05032  0.12526  0.34060
 8   3   0.96608   0.31934  0.53667    0.06159  0.14457  0.36699    0.00832  0.02517  0.08116  0.27735
 8   4   0.72662   0.20197  0.43165    0.03407  0.09900  0.30580    0.00411  0.01511  0.05839  0.23637
 8   5   0.56992   0.13650  0.35684    0.02053  0.07118  0.26018    0.00225  0.00970  0.04368  0.20492
 8   6   0.45236   0.09387  0.29666    0.01265  0.05170  0.22203    0.00126  0.00632  0.03288  0.17793
 8   7   0.35360   0.06285  0.24267    0.00752  0.03653  0.18648    0.00068  0.00397  0.02408  0.15214
 8   8   0.25641   0.03723  0.18552    0.00382  0.02303  0.14718    0.00031  0.00215  0.01585  0.12277
 9   1   3.45832   2.78021  1.29314    1.16395  0.71753  0.74089    0.32747  0.24112  0.30409  0.50457
 9   2   1.53864   0.67103  0.75749    0.16036  0.26303  0.48749    0.02601  0.05645  0.13558  0.35478
 9   3   1.04970   0.36212  0.57242    0.07194  0.16078  0.38753    0.00994  0.02887  0.08914  0.29097
 9   4   0.79884   0.23377  0.46516    0.04090  0.11214  0.32591    0.00507  0.01777  0.06521  0.25009
 9   5   0.63636   0.16223  0.38976    0.02553  0.08257  0.28066    0.00290  0.01178  0.04986  0.21923
 9   6   0.51677   0.11592  0.33050    0.01653  0.06207  0.24379    0.00173  0.00803  0.03874  0.19347
 9   7   0.42016   0.08284  0.27974    0.01070  0.04651  0.21114    0.00103  0.00546  0.02995  0.17016
 9   8   0.33459   0.05714  0.23208    0.00661  0.03367  0.17943    0.00058  0.00354  0.02241  0.14699
 9   9   0.24664   0.03474  0.17970    0.00348  0.02170  0.14315    0.00027  0.00197  0.01503  0.11974
10   1   3.66022   3.00714  1.34843    1.27330  0.76562  0.76678    0.36057  0.26097  0.32173  0.51978
10   2   1.64122   0.73783  0.79554    0.17988  0.28472  0.50782    0.02961  0.06243  0.14532  0.36767
10   3   1.12832   0.40384  0.60530    0.08229  0.17626  0.40619    0.01160  0.03251  0.09665  0.30325
10   4   0.86626   0.26478  0.49570    0.04778  0.12466  0.34399    0.00607  0.02039  0.07161  0.26232
10   5   0.69769   0.18726  0.41935    0.03058  0.09336  0.29879    0.00357  0.01383  0.05561  0.23176
10   6   0.57502   0.13720  0.36018    0.02047  0.07178  0.26254    0.00222  0.00973  0.04411  0.20670
10   7   0.47793   0.10173  0.31072    0.01391  0.05560  0.23130    0.00140  0.00691  0.03516  0.18466
10   8   0.39541   0.07475  0.26646    0.00933  0.04262  0.20250    0.00087  0.00484  0.02772  0.16394
10   9   0.31938   0.05273  0.22349    0.00593  0.03144  0.17367    0.00051  0.00322  0.02108  0.14275
10  10   0.23856   0.03274  0.17484    0.00320  0.02061  0.13976    0.00024  0.00184  0.01436  0.11718
11   1   3.85237   3.22756  1.40017    1.38070  0.81163  0.79080    0.39330  0.28024  0.33845  0.53381
12   1   4.03607   3.44218  1.44888    1.48635  0.85582  0.81323    0.42570  0.29899  0.35437  0.54684
13   1   4.21235   3.65161  1.49497    1.59041  0.89840  0.83430    0.45779  0.31727  0.36958  0.55902
14   1   4.38203   3.85633  1.53876    1.69301  0.93953  0.85418    0.48961  0.33512  0.38418  0.57047
15   1   4.54581   4.05677  1.58052    1.79426  0.97938  0.87302    0.52116  0.35258  0.39822  0.58128
16   1   4.70426   4.25327  1.62047    1.89425  1.01804  0.89094    0.55246  0.36969  0.41175  0.59152
17   1   4.85787   4.44615  1.65880    1.99308  1.05563  0.90804    0.58353  0.38646  0.42484  0.60125
18   1   5.00706   4.63566  1.69566    2.09082  1.09224  0.92439    0.61438  0.40294  0.43751  0.61054
19   1   5.15220   4.82204  1.73118    2.18754  1.12794  0.94008    0.64503  0.41912  0.44980  0.61941
20   1   5.29358   5.00550  1.76548    2.28329  1.16279  0.95515    0.67548  0.43504  0.46173  0.62792
21   1   5.43150   5.18622  1.79865    2.37812  1.19686  0.96967    0.70574  0.45071  0.47335  0.63609
22   1   5.56620   5.36436  1.83079    2.47210  1.23021  0.98368    0.73583  0.46615  0.48467  0.64395
23   1   5.69788   5.54007  1.86198    2.56525  1.26287  0.99721    0.76574  0.48136  0.49570  0.65152
24   1   5.82675   5.71349  1.89227    2.65762  1.29488  1.01031    0.79550  0.49636  0.50647  0.65884
25   1   5.95298   5.88473  1.92174    2.74924  1.32630  1.02300    0.82510  0.51117  0.51700  0.66591


ACKNOWLEDGMENT 

The author wishes to thank the referee for some helpful suggestions in the preparation of this paper.



REFERENCES 

[1] Barnett, V.D., "Order Statistics Estimators of the Location of the Cauchy Distribution," Journal of the American Statistical Association 61, 1205-1218 (1966). Correction 63, 383-385 (1968).
[2] Breiter, M.C. and P.R. Krishnaiah, "Tables for the Moments of Gamma Order Statistics," Sankhya B 30, 59-72 (1968).
[3] David, H.A., Order Statistics (Wiley: New York, 1970).
[4] Gupta, S.S., "Order Statistics from the Gamma Distribution," Technometrics 2, 243-262 (1960).
[5] Joshi, P.C., "Recurrence Relations Between Moments of Order Statistics from Exponential and Truncated Exponential Distributions," Sankhya B 39, 362-371 (1978).
[6] Krishnaiah, P.R. and M.H. Rizvi, "A Note on the Moments of Gamma Order Statistics," Technometrics 9, 315-318 (1967).
[7] Shah, B.K., "Note on the Moments of a Logistic Order Statistics," Annals of Mathematical Statistics 41, 2150-2152 (1970).
[8] Young, D.H., "Moment Relations for Order Statistics of the Standardized Gamma Distribution and the Inverse Multinomial Distribution," Biometrika 58, 637-640 (1971).



A NEW STORAGE REDUCTION TECHNIQUE FOR 
THE SOLUTION OF THE GROUP PROBLEM 

Richard V. Helgason and Jeff L. Kennington 

Department of Operations Research 

and Engineering Management 

Southern Methodist University 

Dallas, Texas 

ABSTRACT 

This paper shows that by making use of an unusual property of the decision
table associated with the dynamic programming solution to the group problem,
it is possible to dispense with table storage as such, and instead overlay values
for both the objective and history functions. Furthermore, this storage reduc-
tion is accomplished with no loss in computational efficiency. An algorithm is
presented which makes use of this technique and incorporates various addi-
tional efficiencies. The reduction in storage achieved for problems from the
literature is shown.

I. INTRODUCTION 

The group theoretic approach to integer programming was first presented by Ralph 
Gomory [4] in 1969. The basic theoretical results may be found in [3, 4, 7, 9, 10, 11] and 
computational experience with variations of the approach may be found in [2, 5, 6]. The first 
step is to solve the continuous relaxation of the integer program. If the solution is integer, the 
problem is solved. If not, one then uses the optimal linear programming basis to derive a 
relaxation of the integer program known as the group (knapsack) problem. Mathematically, the 
group problem may be assumed to take the following form: 

$$\min \sum_{t=1}^{q} c_t x_t$$

(1) $$\text{s.t.} \quad \sum_{t=1}^{q} g_t x_t = d \pmod e,$$

$x_t$ a non-negative integer for all $t$,

where $g_1, \ldots, g_q$, $d$, and $e$ are known integer $r$-vectors and the $c_t$'s are known non-negative scalars. The details for obtaining the group problem in the above form may be found in [3, 7, 9]. The integer vectors $g_t$ may be used to generate an abelian group under addition modulo $e$ and hence the names group theoretic approach and group problem.

The next step in the group theoretic approach is to solve (1). Gomory [4] presents a sim- 
ple dynamic programming algorithm for solving this problem. Dynamic programming (see [1, 
8]) is a multi-stage solution procedure in which a recursive relation is used to compute columns 




in a decision table. Each successive column represents a further stage in the optimization procedure. A group problem with $q$ columns will require a $q$-stage optimization. At the conclusion of the $q$th stage, the solution may be recovered by backtracking through the decision table.

In this paper, we present an unusual property of this particular decision table which allows one to recover the optimal solution from the information associated with only the $q$th stage. Consequently, we can dispense with table storage as such for the dynamic programming technique by overlaying all table values at successive stages.



II. SOLVING GROUP PROBLEMS VIA DYNAMIC PROGRAMMING 

Consider the following two-parameter class of group minimization problems over the group $G = \{g_0, g_1, \ldots, g_{m-1}\}$:

$$(P_{lk}) \quad \min \sum_{i=1}^{k} c_i x_i \quad \text{s.t.} \quad \sum_{i=1}^{k} g_i x_i = g_l \pmod e, \quad x_i \text{ a non-negative integer for all } i,$$

where the integers $l$ and $k$ are bounded by $0 \le l \le m - 1$ and $1 \le k \le q$. $g_0 = 0$, the zero vector. Let $f_{lk}$ denote the optimal objective value of $P_{lk}$ if a solution exists and let $f_{lk} = \infty$ otherwise. $f_{00}$ is taken to be 0 while $f_{l0} = \infty$ for all $l \ge 1$. At the $k$th stage of the dynamic programming solution procedure, one must find $f_{lk}$ for $l = 0, 1, \ldots, m - 1$; that is, (1) is solved using only $x_1, \ldots, x_k$ and with all group elements as right-hand sides of the congruence. If $l^*$ is such that $g_{l^*} = d$, then the solution of $P_{l^* q}$ is also the solution of (1).

A dynamic programming algorithm can be developed for $P_{lk}$ by noting that the solutions to $P_{lk}$ can be partitioned into those with $x_k = 0$ and those with $x_k \ge 1$. Let $l'$ be defined such that $g_{l'} = g_l - g_k$. Then the above is equivalent to saying that for $x_k = 0$, $f_{lk} = f_{l,k-1}$; and for $x_k \ge 1$, $f_{lk} = c_k + f_{l'k}$. Thus a recursive equation may be written as follows:

(2) $f_{lk} = \min\{f_{l,k-1},\; c_k + f_{l'k}\}.$

To apply (2), one must be able to compute $f_{l'k}$ for all $l$. This is always possible if $g_k$ generates the whole group. In the case where $g_k$ does not generate $G$, the following procedure is used. For each coset, one chooses an element $g_{l_0}$ and sets $f^*_{l_0 k} = f_{l_0,k-1}$. Then one generates elements of the coset successively using $g_l = g_{l_0} + a g_k$, for $a = 1, 2, \ldots$, and computes

(3) $f^*_{lk} = \min\{f_{l,k-1},\; c_k + f^*_{l'k}\}$

with $l'$ determined by $g_{l'} = g_l - g_k$.

The above procedure is cyclic, and should be terminated when all new $f^*_{lk}$ agree with those computed in the previous cycle. Then, $f_{lk}$ is taken to be $f^*_{lk}$. Termination is guaranteed within two cycles and occurs anytime during the second cycle when any $f^*_{lk}$ agrees with its previous value. Obviously this procedure may also be applied when $g_k$ generates the whole group since the coset of interest is $G$. Justification for the above procedure may be found in [3, 7, 9, 10]. In order to recover the solution at the termination of stage $q$, one also carries a history function, which keeps track of the variable used in (3) to obtain the minimum. Note that in case $f_{l,k-1} = c_k + f_{l'k}$, an arbitrary decision is possible. This gives rise to a number of possible realizations of the history function, corresponding to the combinations of alternate optima for each problem $P_{lk}$. To facilitate the intended storage reduction, we define a particular history function as follows:



(4) h„ = 



^u-i. if//* = fi.k-\. and 
k, otherwise. 



Note that this records the most recent stage for which a strict decrease occurred in the objective 
value associated with gi. Thus, in case //,t-i = Q + //*•*;, one does not use the variable associ- 
ated with the new stage. 

A naive implementation of the above algorithm requires a decision table with $mq(t + p)$ bits, where $t$ is the number of bits required to carry the $f_{lk}$'s and $p$ must be such that $2^p \ge q + 1$ (i.e., with $p$ bits we represent the numbers 0 to $q$). This algorithm as described in [7, 9, 10] requires the full table size while the presentation in [3] assumes a table with $mq(t + 1)$ bits. However, all values of $f_{lk}$ and $h_{lk}$ need not be saved since the recursions (2) and (3) can be executed with partial information about the current stage and partial information about the previous stage. Therefore, one may easily implement the algorithm with a table of size $m(t + q)$ bits. Even so, the dynamic programming decision table may become quite large. We remark that with a table of size $m(t + q)$ bits it is possible to recover all alternate optima.

Implementation of the procedure may be enhanced if the group elements, $G = \{g_0, g_1, \ldots, g_{m-1}\}$, are ordered. Since $0 \le g_i < e$ for all $i$, there is a natural ordering of the group elements. For any element of $G$, say $\beta = [\beta_1, \ldots, \beta_r]$, we assign the order of $\beta$, denoted $l(\beta)$, as follows:

$$l(\beta) = \beta_1 + \sum_{i=2}^{r} \beta_i \prod_{k=1}^{i-1} e_k.$$

This corresponds to array subscripting, using $r$ subscripts with the first varying most rapidly. Using the above ordering, the recovery procedure is quite simple and is given below:
Using the above ordering, the recovery procedure is quite simple and is given below: 

1. [Initialize variables] $x_i \leftarrow 0$, $i = 1, \ldots, q$

2. [Start at $d$] $g \leftarrow d$

3. [Start at stage $q$] $k \leftarrow q$

4. [Reference history] $k \leftarrow h_{l(g),k}$

5. [Add to solution] $x_k \leftarrow x_k + 1$

6. [Backtrack] $g \leftarrow g - g_k$

7. [Done?] If $g \ne 0$, go to 4; otherwise, terminate.

Step 4 above implies that the history function for each group element, $G = \{g_0, \ldots, g_{m-1}\}$, and each stage, $k = 1, \ldots, q$, must be available for solution recovery.




III. DYNAMIC PROGRAMMING ALGORITHM USING THE 
STORAGE REDUCTION TECHNIQUE 

In this section it is shown that $h_{l(g),k}$ may be replaced by $h_{l(g),q}$ in step 4 of the recovery procedure. Hence, only the $q$th stage history function need be available for solution recovery. Consider the following propositions.

PROPOSITION 1: If $h_{lk} = j \le k$, then $h_{lj} = h_{l,j+1} = \cdots = h_{lk} = j$. The proof of Proposition 1 is obvious by the definition of the history function given in (4).

PROPOSITION 2: For all integers $l$ and $k$ with $0 \le l \le m - 1$ and $1 \le k \le q$, if $j = h_{lk}$, then $h_{l'k} \le j$ where $g_{l'} = g_l - g_j$.

PROOF: Choose $l$ and $k$ arbitrarily and let $j = h_{lk}$. Let $g_{l'}$ be such that $g_{l'} = g_l - g_j$, and let $j' = h_{l'k}$. We must show that $j' \le j$. Assume the contrary. Then by Proposition 1 and the definition of the history function,

(5) $h_{lk} = \cdots = h_{lj'} = \cdots = h_{lj} = j$, with

(6) $f_{lk} = \cdots = f_{lj'} = \cdots = f_{lj} < f_{l,j-1}$; and

(7) $h_{l'k} = \cdots = h_{l'j'} \ne h_{l',j'-1}$, with

(8) $f_{l'k} = \cdots = f_{l'j'} < f_{l',j'-1} \le \cdots \le f_{l'j}$.

(9) (8) implies that $f_{l'j} > f_{l'j'}$.

From (6) we have that

$$f_{lj'} = f_{l'j} + c_j.$$

Let $\{x_1 = a_1, \ldots, x_j = a_j, \ldots, x_{j'} = a_{j'}\}$ denote an optimum for $P_{l'j'}$. Then $\{x_1 = a_1, \ldots, x_j = a_j + 1, \ldots, x_{j'} = a_{j'}\}$ is feasible for $P_{lj'}$, since $g_l = g_{l'} + g_j$, and has value $f_{l'j'} + c_j$. Since the objective value for an optimum of $P_{lj'}$ must be less than or equal to the objective value for any feasible solution, $f_{l'j} + c_j \le f_{l'j'} + c_j$. This implies $f_{l'j} \le f_{l'j'}$, which contradicts (9). Therefore $j' \le j$ and the proposition is proved.

We now use the above propositions to prove that $h_{l(g),q}$ can replace $h_{l(g),k}$ in step 4 of the recovery algorithm.

PROPOSITION 3: For all integers $l$ with $0 \le l \le m - 1$, if $j = h_{lq}$, then $h_{l'q} = h_{l'j}$ where $g_{l'} = g_l - g_j$.

PROOF: Choose $l$ arbitrarily and let $j = h_{lq}$. Let $g_{l'}$ be such that $g_{l'} = g_l - g_j$. By Proposition 2, $h_{l'q} \le j$. Then from Proposition 1, $h_{l'q} = h_{l'j}$.

Since only the $q$th stage history function is required for solution recovery, we drop the subscript associated with the stage for both the $f_{lk}$'s and $h_{lk}$'s. The complete dynamic programming algorithm may then be stated as follows:




REDUCED STORAGE D.P. ALGORITHM FOR THE GROUP PROBLEM 

1. Initialize
   a. [Objective values] $f_0 \leftarrow 0$; $f_i \leftarrow \infty$, $i = 1, \ldots, m - 1$.
   b. [History values] $h_i \leftarrow q + 1$, $i = 0, 1, \ldots, m - 1$.
   c. [Stage counter] $k \leftarrow 0$.

2. Begin New Stage
   a. [Increment stage counter] $k \leftarrow k + 1$.
   b. [Flag group elements not updated] $h_i \leftarrow -h_i$, $i = 0, 1, \ldots, m - 1$.
   c. [Select first coset element] $g \leftarrow d + g_k$.

3. Find a Coset Element Which May Lead to an Improved Solution
   a. [Save starting index] $l^* \leftarrow l(g)$.
   b. [Test objective value] If $f_{l(g)} < f_{l(d)}$, go to 4.
   c. [Generate another coset element] $g \leftarrow g + g_k$.
   d. [Coset exhausted?] If $l(g) \ne l^*$, go to b.
   e. [Flag coset elements updated] $h_{l(g)} \leftarrow |h_{l(g)}|$ for all $g$ in the coset containing $g_{l^*}$; go to 5.

4. Apply Recursion to Coset
   a. [Starting value is previous value] $v \leftarrow f_{l(g)}$.
   b. [Possible next value using stage $k$ generator] $v \leftarrow v + c_k$.
   c. [Flag element updated] $h_{l(g)} \leftarrow |h_{l(g)}|$.
   d. [Generate another coset element] $g \leftarrow g + g_k$.
   e. [Test for minimum] If $f_{l(g)} \le v$, go to h.
   f. [Decrease in value] $f_{l(g)} \leftarrow v$.
   g. [Update history] $h_{l(g)} \leftarrow k$; go to b.
   h. [Element previously updated?] If $h_{l(g)} > 0$, go to 5; otherwise, go to a.

5. Test to Terminate, Go to Next Stage, or Update Another Coset
   a. [Last stage?] If $k = q$, recover solution and terminate.
   b. [Test for another coset to update] Let $g$ be an element for which $h_{l(g)} < 0$. If none, go to 2; otherwise go to 3.
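A compact sketch of the central idea for the special case of a cyclic group $Z_m$ (our simplification, not the paper's full algorithm: it keeps only the overlay of $f$ and $h$ and the stage-$q$ recovery, omitting the sign-flagging and coset-skipping refinements above; the function name is illustrative):

```python
from math import gcd

def solve_group_cyclic(m, gens, costs, d):
    """min sum c_t x_t  s.t.  sum g_t x_t = d (mod m), x_t >= 0 integer.

    f and h are overlaid across stages; the solution is recovered from the
    final history function h alone (cf. Propositions 1-3)."""
    q = len(gens)
    f = [float("inf")] * m
    f[0] = 0.0
    h = [q + 1] * m                      # stage of the most recent strict decrease
    for k in range(1, q + 1):
        g = gens[k - 1] % m
        n_cosets = gcd(g, m) if g else m
        cycle = m // n_cosets
        for start in range(n_cosets):            # one pass per coset of <g_k>
            l = start
            for _ in range(2 * cycle):           # termination within two cycles
                lp = (l - g) % m
                if f[lp] + costs[k - 1] < f[l]:  # strict decrease only; ties keep h
                    f[l] = f[lp] + costs[k - 1]
                    h[l] = k
                l = (l + g) % m
    x = [0] * q                          # recovery: backtrack through h from d
    g = d % m
    while g != 0:
        k = h[g]
        if k > q:                        # d is not reachable
            return None, None
        x[k - 1] += 1
        g = (g - gens[k - 1]) % m
    return f[d % m], x
```

For example, with $m = 5$, generators $(1, 3)$, unit costs and right-hand side 4, the sketch returns the value 2 with $x = (1, 1)$.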

The above algorithm is essentially equivalent to the original procedure presented by Gomory [4], except that no arbitrary decisions are possible using (3) and all information is overlaid in the decision table. Hence, our approach may save considerable core storage with no loss in efficiency. Furthermore, additional computational efficiencies have been incorporated as follows:






(i) At any stage, if no element of a particular coset has objective value less than $f_{l(d)}$, the entire coset is not actually updated since any problem solution so derived cannot be part of an optimal solution to $P_{l(d),k}$.

(ii) At each stage, the coset containing $d$ is updated first. Thus at stage $q$ we can terminate without updating any other cosets, and during earlier stages, a better value for $f_{l(d)}$ is used in the above test.

(iii) By using one additional bit per group element we flag cosets. Hence, no coset is ever considered more than once. In the algorithm presented, the flag bit is the sign of $h_i$.

This new procedure is implemented with $m(t + p + 1)$ bits, as compared to $m(t + q)$ bits for an efficient implementation not using the overlay feature, where $t$ denotes the number of bits required to store an $f_l$ and $p$ is selected such that $2^p \ge q + 2$. Table 1 presents a comparison of storage requirements on typical group problems taken from [6], with $t$ taken as representative of the word size in bits for two classes of machines. The storage savings ranges from 12 to 85% with an average of approximately 50%.
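The bit counts in Table 1 follow directly from these formulas. A small sketch (illustrative helper names) reproduces, for example, the entries for the first problem listed ($m = 24$, $q = 240$, 32-bit words):

```python
def p_bits(q):
    """Smallest p with 2**p >= q + 2."""
    p = 0
    while 2 ** p < q + 2:
        p += 1
    return p

def standard_bits(m, t, q):          # full decision table, no overlay: m(t + q)
    return m * (t + q)

def reduced_bits(m, t, q):           # overlaid table, one f and one flagged h per element
    return m * (t + p_bits(q) + 1)

savings = 100 * (1 - reduced_bits(24, 32, 240) / standard_bits(24, 32, 240))
```

Rounding `savings` gives the tabulated 85%.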



TABLE 1. Comparison of Table Size for Dynamic Programming Algorithm

                         Basis        #                   IBM 360/370 (32 bit words)       CDC 6000/7000 (60 bit words)
 Table #    Problem #    Determinant  Nonbasics    p      Standard   Reduced       %       Standard   Reduced        %
 (see [6])  (see [6])    m            q       2^p >= q+2  m(32+q)    m(32+p+1)  Savings    m(60+q)    m(60+p+1)  Savings
   3a           5            24         240        8         6528        984      85          7200       1656      77
   3a          10           144          36        6         9792       5616      43         13824       9648      30
   3a          15           180         240        8        48960       7380      85         54000      12420      77
   3a          20           280         109        7        39480      11200      72         47320      19040      60
   3a          25           512         104        7        69632      20480      71         83968      34816      59
   3a          30          1080         109        7       152280      43200      72        182520      73440      60
   3a          35          2048         104        7       278528      81920      71        335872     139264      59
   3b           2            48         195        8        10896       1968      82         12240       3312      73
   3b           4           128          14        4         5888       4736      20          9472       8320      12
   3b           6           864          36        6        58752      33696      43         82944      57888      30
   3b           8          5025          18        5       251250     190950      24        391950     331650      15
   3b          10          6912          36        6       470016     269568      43        663552     463104      30



IV. SUMMARY 

We have shown that by using a history function which records only the most recent stage at which a strict decrease occurred in the objective value associated with each group element, it is possible to dispense with table storage as such, and overlay values both for the objective and history functions. We have shown that this storage reduction may be accomplished with no loss in computational efficiency and have incorporated this technique into a highly efficient algorithm. Our procedure does not allow for the recovery of alternate optima. However, by dynamically storing a partial table consisting of all occurrences of ties in (3) following a strict decrease of $f_{lk}$ (as given by $h_{lk}$), alternate optima may be determined.

REFERENCES 



[1] Bellman, R.E. and S.E. Dreyfus, Applied Dynamic Programming (Princeton University Press, Princeton, New Jersey, 1962).

[2] Fisher, M.L., W.D. Northup and J.F. Shapiro, "Using Duality to Solve Discrete Optimization Problems: Theory and Computational Experience," Mathematical Programming Study 3, 56-94 (1975).




[3] Garfinkel, R.S. and G.L. Nemhauser, Integer Programming (John Wiley and Sons, New 
York, New York, 1972). 

[4] Gomory, R.E., "Some Polyhedra Related to Combinatorial Problems," Linear Algebra and 
Its Applications, 2, 451-558 (1969). 

[5] Gorry, G.A. and J.F. Shapiro, "An Adaptive Group Theoretic Algorithm for Integer Pro- 
gramming," Management Science, 17(5), 285-306 (1971). 

[6] Gorry, G.A., W.D. Northrup and J.F. Shapiro, "Computational Experience With a Group 
Theoretic Integer Programming Algorithm," Mathematical Programming, 4, 171-192 
(1973). 

[7] Hu, T.C., Integer Programming and Network Flows (Addison-Wesley, Reading, Mass., 1969).

[8] Nemhauser, G.L., Introduction to Dynamic Programming (John Wiley and Sons, New York, 
New York, 1966). 

[9] Salkin, H.M., Integer Programming (Addison-Wesley Publishing Company, Reading, Mass., 
1975). 
[10] Taha, H.A., Integer Programming: Theory, Applications, and Computations (Academic Press, 

New York, New York, 1975). 
[11] Zionts, S., Linear and Integer Programming (Prentice-Hall, Inc., Englewood Cliffs, New Jer- 
sey, 1974). 



EXPERIMENTS WITH LINEAR FRACTIONAL PROBLEMS 

Gabriel R. Bitran 

Massachusetts Institute of Technology 
Cambridge, Massachusetts 

ABSTRACT 

In this paper we present the results of a limited number of experiments 
with linear fractional problems. Six solution procedures were tested and the 
results are expressed in the number of simplex-like pivots required to solve a 
sample of twenty problems randomly generated. 

Two main approaches emerge from the literature to solve the linear fractional problem: 

(P) $\quad v = \max\{f(x) = n(x)/d(x) : x \in F\}$

where $n(x) = c_0 + cx$, $d(x) = d_0 + dx$, $F = \{x \in R^n : Ax = b,\ x \ge 0\}$, $c_0$ and $d_0$ are real numbers, $c$ and $d$ are real $n$-vectors, $A$ is an $m \times n$ real matrix and $b$ is a real $m$-vector. We assume in this note that $F$ is compact and that $\min\{d(x) : x \in F\} > 0$.

Charnes and Cooper [4] transform problem (P) into the linear program: 

$v = \max\{c_0 t + cy : Ay - bt = 0,\ d_0 t + dy = 1,\ \text{and } t, y \ge 0\}.$
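The change of variables behind this transformation is $y = x/d(x)$, $t = 1/d(x)$. A small sketch on a hypothetical two-variable instance (our data, with $F$ the standard simplex, so the optimum sits at a vertex) verifies that the fractional optimum and the transformed linear objective agree:

```python
from fractions import Fraction as Fr

c0, c = Fr(1), (Fr(3), Fr(-1))          # numerator n(x) = c0 + c.x
d0, d = Fr(2), (Fr(1), Fr(4))           # denominator d(x) = d0 + d.x > 0 on F

def n_of(x): return c0 + c[0] * x[0] + c[1] * x[1]
def d_of(x): return d0 + d[0] * x[0] + d[1] * x[1]

vertices = [(Fr(0), Fr(0)), (Fr(1), Fr(0)), (Fr(0), Fr(1))]

v = max(n_of(x) / d_of(x) for x in vertices)        # optimal value of (P)

cc_values = []
for x in vertices:                                   # Charnes-Cooper image of each vertex
    t = 1 / d_of(x)
    y = (x[0] * t, x[1] * t)
    assert d0 * t + d[0] * y[0] + d[1] * y[1] == 1   # normalization constraint holds
    cc_values.append(c0 * t + c[0] * y[0] + c[1] * y[1])
```

Here $v = 4/3$, attained at the vertex $(1, 0)$, and the maximum of the transformed objective over the image points is also $4/3$.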

This approach has been extended to the nonlinear versions of (P) by Bradley and Frey [3] and 
Schaible [8]. The second approach solves a sequence of linear problems or at least one pivot 
step of each linear program over the original feasible set by updating the objective function. 
Algorithms in this category are related to ideas first presented by Isbell and Marlow [5] and 
Martos [6]. Similar algorithms have been proposed by several other authors. The interested 
reader is referred to the excellent bibliography collected by I.M. Stancu-Minasian [9]. Methods 
in the second approach propose to solve (P) through a sequence of linear programs:

$(LP_k) \quad \Gamma(x^k) = \max\{\Gamma(x^k, x) = n(x) - f(x^k)\,d(x) : x \in F\}, \quad k = 0, 1, 2, \ldots$

where $x^0$ is a given feasible point and $x^k$ for $k \ge 1$ is defined in Isbell and Marlow's procedure as being the optimal solution to $(LP_{k-1})$, and in Martos's procedure as the first feasible basis in $(LP_{k-1})$ for which $\Gamma(x^{k-1}, x) > 0$. Both algorithms terminate at an iteration $k_0$ for which $\Gamma(x^{k_0}) = 0$. In this case $x^{k_0} = x_{\text{optimal}}$. It is worth noting that Wagner and Yuan [10] related
the two main approaches by showing that Martos's algorithm is equivalent to Charnes and 
Cooper's method in the sense that both algorithms lead to an identical sequence of pivoting 
operations. Bitran and Magnanti [1] have extended the connection between these approaches 
by relating them to generalized programming. No theoretical or empirical evidence has been 
given in the past indicating which of the several existing algorithms is preferred. 





In this note we present the results, in number of simplex-like pivots, of twenty problems of type (P), randomly generated, solved by the following six algorithms (each problem when solved by each of the six procedures was started with the same basic feasible solution):

A) Maximize $n(x)$ over the feasible set obtaining the optimal solution $x^*$. Next, apply Isbell and Marlow's algorithm with $x^0 = x^*$.

B) Minimize $d(x)$ over the feasible set obtaining the optimal solution $x^*$. Next, apply Isbell and Marlow's algorithm with $x^0 = x^*$.

C) Maximize $g(x) = [c - (cd/dd)d]x$ over $F$ obtaining the optimal solution $x^*$. Next, apply Isbell and Marlow's algorithm with $x^0 = x^*$ (Bitran and Novaes [2] suggested the objective function $g(x)$).

D) Isbell and Marlow's algorithm. 

E) Martos's algorithm. 

F) The author considered it relevant to compare these algorithms with the number of pivots necessary to solve the linear programs:

(LP) $\quad \max\{n(x) - v\,d(x) : x \in F\}$

where for each of the twenty problems (P), $v$ is chosen as its optimal value. The optimal value of (LP) is zero and any solution to (LP) is optimal in the fractional program (P) ([1]). Note that (LP) corresponds to $(LP_k)$ with $x^k = x_{\text{optimal}}$.
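Procedure F's defining property is easy to check on a toy instance (a sketch with hypothetical data; $F$ is a simplex so vertex enumeration suffices): with $v$ set to the optimal value of (P), the parametric objective $n(x) - v\,d(x)$ has maximum exactly zero, attained at a point optimal for (P).

```python
from fractions import Fraction as Fr

c0, c = Fr(1), (Fr(3), Fr(-1))          # hypothetical n(x) = c0 + c.x
d0, d = Fr(2), (Fr(1), Fr(4))           # hypothetical d(x) = d0 + d.x > 0 on F
vertices = [(Fr(0), Fr(0)), (Fr(1), Fr(0)), (Fr(0), Fr(1))]

def n_of(x): return c0 + c[0] * x[0] + c[1] * x[1]
def d_of(x): return d0 + d[0] * x[0] + d[1] * x[1]

v = max(n_of(x) / d_of(x) for x in vertices)            # optimal value of (P)

lp_values = [n_of(x) - v * d_of(x) for x in vertices]   # objective of (LP) at each vertex
best = vertices[lp_values.index(max(lp_values))]        # maximizer of (LP)
```

Exact rational arithmetic avoids any tolerance questions in the check.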

The characteristics of the data of the twenty randomly generated problems are the follow- 
ing: 

n = 40, m = 20; the absolute value of each a_ij, the (i,j)th element of each matrix A, was
randomly generated in the interval (0,10], the density of negative elements being 20%. Each
component b_i, i = 1, 2, ..., m of each right hand side b was defined as (1/2) Σ_{j=1}^{n} a_ij. The objec-
tive function coefficients c_0, c_j, d_0, d_j, j = 1, 2, ..., n were generated in the intervals [-1000
≤ c_0, c_j ≤ 1000; 0 < d_0, d_j ≤ 1], [1 ≤ c_0, c_j ≤ 1000; 1 ≤ d_0, d_j ≤ 2] or [-1000 ≤ c_0,
c_j ≤ -1; 1 ≤ d_0, d_j ≤ 2]. The reason for choosing such intervals was to obtain five problems
with an angle θ between the gradients of the numerator and denominator, i.e.,
cos θ = cd / (||c|| ||d||), in each of the four intervals [0, π/4], [π/4, π/2], [π/2, 3π/4], [3π/4, π], in an
attempt to identify a correlation between the algorithms tested and the geometry of linear frac-
tional programs. The geometric properties of problem (P) are consequences of the following
facts.

i) The hyperplanes n(x) - L d(x) = 0 contain, for each L, both the set {x ∈ R^n:
f(x) = L} and CE = {x ∈ R^n: n(x) = 0 and d(x) = 0}. The set CE is called the
center of the problem because as L varies the hyperplanes rotate about it, giving a
"star" centered at CE ([2]).

ii) The objective function f(x) is pseudo-concave and quasiconvex on the set {x ∈ R^n:
d(x) > 0}, i.e., f(y) > f(x) if and only if ∇f(x)(y - x) > 0.
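The test-problem generator described above can be sketched as follows. The random seed, the handling of the open interval (0,10], and the particular coefficient interval used here are illustrative assumptions, not taken from the paper.

```python
import math
import random

random.seed(1)
n, m = 40, 20

# |a_ij| uniform on (0, 10], with a 20% density of negative entries
A = [[random.uniform(1e-9, 10.0) * (-1 if random.random() < 0.2 else 1)
      for _ in range(n)] for _ in range(m)]

# each right-hand side b_i is half the i-th row sum of A
b = [sum(row) / 2.0 for row in A]

# one of the three coefficient intervals: 1 <= c_j <= 1000, 1 <= d_j <= 2
c = [random.uniform(1.0, 1000.0) for _ in range(n)]
d = [random.uniform(1.0, 2.0) for _ in range(n)]

# angle between the gradients of numerator and denominator
dot = sum(cj * dj for cj, dj in zip(c, d))
cos_theta = dot / (math.sqrt(sum(cj * cj for cj in c)) *
                   math.sqrt(sum(dj * dj for dj in d)))
print(round(cos_theta, 3))
```

With both c and d drawn positive, cos θ lands in the first of the four intervals; the other intervals are reached by the alternative coefficient ranges.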



EXPERIMENTS WITH LINEAR FRACTIONAL PROBLEMS 






In R², the geometry of (P) ([2]) suggests that procedure (C) would perform better than
(A) and (B) for high and low values of θ (θ ∈ [0, π]). Table 1 shows the results obtained. For
the first and last five problems a total of 178 pivots was necessary with procedure (C) while 233 
and 363 pivots were required with procedures (A) and (B) respectively. The corresponding 
standard deviations being 3.70, 6.01 and 7.90. For the twenty problems selected Martos's algo- 
rithm performed better than the preceding four and in some cases required fewer pivots than 
procedure (F). Algorithms (C) and (D) were practically equivalent and were followed by (A), 
while (B) performed poorly. The computer code used to solve the twenty problems by the six 
algorithms was an adaptation of Burroughs's commercial code TEMPO. 



TABLE 1

Problem
Number        A       B       C       D       E       F      cos θ

   1         24      21      13      11      12      12      .873
   2         21      34      18      18      15      15      .858
   3         23      34      12      12      10      10      .819
   4         19      39      21      21      18      17      .770
   5         32      46      22      23      21      19      .730
Mean         23.8    34.8    17.2    17.0    15.2    14.6
Std. Dev.     4.44    8.18    4.07    4.77    3.97    3.26

   6         22      32      20      21      15      15      .569
   7         25      57      28      23      18      18      .500
   8         19      16      16      16      15      15      .370
   9         22      51      18      20      11      11      .132
  10         19      39      18      26      20      16      .076
Mean         21.4    39.0    20.0    21.2    15.8    15.0
Std. Dev.     2.24   14.46    4.19    3.31    3.06    2.28

  11         12      38      18      18      11      10     -.103
  12         21      47      21      21      19      20     -.289
  13         18      33      20      22      20      21     -.424
  14         18      36      20      20      19      22     -.485
  15         33      50      31      25      26      19     -.613
Mean         20.4    40.8    22.0    21.2    19.0    18.4
Std. Dev.     6.94    6.55    4.60    2.31    4.77    4.32

  16         19      51      17      17      15      15     -.720
  17         16      30      13      12      13      16     -.747
  18         16      39      20      21      13      15     -.820
  19         33      33      22      23      21      24     -.840
  20         30      36      20      24      18      19     -.874
Mean         22.8    37.8    18.4    19.4    16.0    17.8
Std. Dev.     7.25    7.25    3.14    4.41    3.10    3.43

Total # of
Iterations  442     762     388     394     330     329
Mean         22.1    38.1    19.4    19.7    16.5    16.4
Std. Dev.     5.75    9.88    4.42    4.20    4.07    3.79






TABLE 2. Null Hypotheses Tested, Pairwise Comparisons of Algorithms
(columns A-B, A-C, A-D, A-E, A-F, B-C, B-D, B-E, B-F, C-D, C-E, C-F, D-E, D-F, E-F;
rows W, σ, α) [table entries not legible in the source scan]

TABLE 3. Null Hypotheses Tested, One-Sided Comparisons of Means for Each Pair of Algorithms
(same column pairs; rows W, σ, α) [table entries not legible in the source scan]



To test whether the observed differences in the number of iterations between algorithms are sta-
tistically significant, we performed Wilcoxon's signed rank test [7]. The test was used to com-
pare the algorithms pairwise. The null hypothesis is that the distributions of the number of
iterations required by the pair of algorithms being tested are identical. Table 2 shows the
results obtained. The first row in the table indicates the algorithms being compared. W is the
Wilcoxon statistic, σ is its standard deviation, and α is the smallest level of significance for
which the null hypothesis is rejected in a two-sided symmetrical Wilcoxon test (α represents
the sum of the two tails in the test). As an example, when comparing algorithms C and E the
null hypothesis is rejected for any significance level greater than .2%. The values <.2 in the
last row of the table indicate that the value of α for the corresponding tests is smaller than .2%.
The results in Table 2 suggest that the distributions of the number of iterations required by
algorithms C and D, and by E and F, are not significantly different.
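The pairwise test can be reproduced from the Table 1 pivot counts. The sketch below implements the standard signed-rank statistic with a normal approximation (a textbook formulation, not the authors' computation) and applies it to algorithms C and E.

```python
import math

def wilcoxon_signed_rank(x, y):
    """Two-sided Wilcoxon signed-rank test via the normal approximation.
    Zero differences are dropped; tied |differences| receive average ranks."""
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    # rank |d|, averaging ranks within groups of ties
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    mean = n * (n + 1) / 4.0
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24.0)
    z = (w_plus - mean) / sd
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided p-value
    return w_plus, p

# pivot counts from Table 1, algorithms C and E
C = [13, 18, 12, 21, 22, 20, 28, 16, 18, 18, 18, 21, 20, 20, 31, 17, 13, 20, 22, 20]
E = [12, 15, 10, 18, 21, 15, 18, 15, 11, 20, 11, 19, 20, 19, 26, 15, 13, 13, 21, 18]
w, p = wilcoxon_signed_rank(C, E)
```

The resulting two-sided p-value is well below 1%, consistent with the rejection of the C-E null hypothesis reported in the text (exact values differ slightly from the printed α because of tie and continuity corrections).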

A chi-square test performed to test the null hypothesis that the distribution of the number
of iterations for each algorithm can be approximated by a normal distribution showed that the
null hypothesis cannot be rejected at a confidence level of .995. Under the assumption that
the distributions of the number of iterations required by two algorithms X and Y are normal,
Wilcoxon's test can be used to compare the means μ_X and μ_Y. The results of these tests are
given in Table 3. W and σ are respectively the Wilcoxon statistic and its standard deviation;
α, in the last row of the table, is the smallest level of significance for which the null hypothesis
is rejected in a one-sided test. The null hypotheses in all tests where algorithms E and F are
compared with A, B, C, and D are rejected at very low levels of significance.

REFERENCES 

[1] Bitran, G.R. and T.L. Magnanti, "Duality and Sensitivity Analysis for Fractional Pro- 
grams," Operations Research 24, 675-699 (1976). 

[2] Bitran, G.R. and A.G. Novaes, "Linear Programming with a Fractional Objective Func- 
tion," Operations Research 21, 22-29 (1973). 

[3] Bradley, S.P. and S.C. Frey, Jr., "Fractional Programming with Homogeneous Functions," 
Operations Research 22, 350-357 (1974). 

[4] Charnes, A. and W.W. Cooper, "Programming with Linear Fractional Functionals," Naval 
Research Logistics Quarterly 9, 181-186 (1962). 

[5] Isbell, J.R. and W.H. Marlow, "Attrition Games," Naval Research Logistics Quarterly 3,
71-93 (1956). 

[6] Martos, B., "Hyperbolic Programming," Naval Research Logistics Quarterly 11, 135-155 
(1964). 

[7] Mosteller, F. and R.E.K. Rourke, Sturdy Statistics: Nonparametric and Order Statistics
(Addison-Wesley Publishing Company, Inc., Reading, MA, 1973). 

[8] Schaible, S., "Parameter-Free Convex Equivalent and Dual Programs of Fractional Pro- 
gramming Problems," Zeitschrift fur Operations Research 18, 187-196 (1974). 

[9] Stancu-Minasian, I.M., "Bibliography of Fractional Programming 1960-1976," Preprint No. 
3, February 1977. Academy of Economic Studies, Department of Economic Cybernet- 
ics, Bucharest, Romania.
[10] Wagner, H.M. and J.S.C. Yuan, "Algorithm Equivalence in Linear Fractional Program-
ming," Management Science 14, 301-306 (1968). 



THE SENSITIVITY OF FIRST TERM NAVY REENLISTMENT 
TO CHANGES IN UNEMPLOYMENT AND RELATIVE WAGES* 

Les Cohen 

Government Services Division 

Kenneth Leventhal & Company 

Washington, D.C. 

Diane Erickson Reedy 

Mathtech, Inc. 

A Division of Mathematica, Inc. 

Rosslyn, Virginia 

ABSTRACT 

Multiple regression analysis was used to analyze newly developed twenty 
year time series of first term reenlistment rates for nine major Navy occupa- 
tional categories. Results indicate that there are significant differences among 
the occupational categories in the determinants of their reenlistment behavior. 
More importantly, it is apparent that reenlistment rates are highly sensitive to 
current unemployment and especially unemployment about the time of enlist- 
ment. By comparison, relative wages (measures of military versus private sec- 
tor rates of compensation) are relatively insignificant and appear powerless to 
control reenlistment in the context of normal fluctuations in economic activity. 



I. INTRODUCTION 

This paper reports the results of an analysis of first term reenlistment over the past twenty 
years. The study's principal objectives were to determine the uniqueness of reenlistment 
behavior in the Navy's major enlisted occupational categories, and in the process to measure 
the sensitivity of reenlistment to economic fluctuations and changes in military versus private 
sector rates of compensation. 

In addition to the war in Viet Nam and a large number of major social, political and tech- 
nological developments, the past 20 years (1958-1977) have seen highly varied economic 
activity. Following a long period of recovery through the Kennedy-Johnson-Viet Nam era, 
there have been two dramatic, successive recessions since 1969. Over the past twenty years, 
unemployment rates ranged from 3.5 percent to almost 9 percent, averaging 5.5 percent with a 
standard deviation of 1.4 percent. 

Against this background of economic fluctuations, first term reenlistment rates for each of 
the nine major Navy occupational categories which were studied demonstrated a bimodal or 
saddle-shaped pattern, with a mild rise during the early 1960's and considerably more dramatic 



•This research was supported by the Office of the Chief of Naval Operations, Systems Analysis Division, under contract 
N00074-78-C-0073 with Information Spectrum, Inc., Arlington, Virginia. 




696 L. COHEN & D.E. REEDY 



increases in the early to mid-1970's. Reenlistment rates over the twenty years, on the average, 
ranged between 10 and 50 percent.* 

From individual monthly editions of NAVY MILITARY PERSONNEL STATISTICS, 
numbers of first term eligibles and reenlistments were recorded for each of the major occupa- 
tional categories reported in a given month. These categories ranged from a maximum of 28 to 
as few as 19. The need for consistency over the twenty year sample dictated the collapsing of 
these groups into 17 occupational categories for which data was present throughout the time 
series. These were in turn combined into the nine occupational groups on which the study 
focused. (Apprentice categories were added to their journeyman counterparts. Precision 
Equipment was combined with Electronics, and Dental with Medical.) 

A reduction in the number of categories was effected to increase the size of the statistics 
reported for each group and to minimize spurious movements in the reenlistment rates which 
would obscure the meaning of experimental findings. To further improve the quality of data, 
monthly observations were converted to quarterly, again for the purpose of increasing the 
number of eligibles to reasonable levels and to smooth out the time series to render real trends 
more readily intelligible. The resultant data base for nine Navy enlisted occupations contained 
reenlistment rates for 80 calendar quarters covering the twenty years from 1958 through 1977. 
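The monthly-to-quarterly conversion amounts to pooling counts before forming a rate. A minimal sketch with hypothetical counts (the published series is not reproduced here):

```python
def quarterly_rate(monthly):
    """Pool (reenlistments, eligibles) over three months and form one rate.
    Pooling the counts first avoids the instability of averaging
    small-sample monthly rates."""
    reenl = sum(r for r, _ in monthly)
    elig = sum(e for _, e in monthly)
    return reenl / elig

# hypothetical (reenlistments, eligibles) for one occupational group, one quarter
q = [(12, 80), (9, 95), (14, 90)]
print(round(quarterly_rate(q), 3))  # 35 reenlistments over 265 eligibles
```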

II. METHODOLOGY 

The methodology which the study employed was multiple regression via ordinary least 
squares. The same equation was estimated for each of the nine occupational categories. It was 
decided that differences among occupations would be deduced from comparisons of individual 
variable performances and, to a much lesser extent, from the R^ statistics. No attempt was 
made to estimate the most effective equations for each occupation. Instead, variables and 
transformations were selected based on their frequency of significance and general impact across 
all occupations evaluated collectively. 

In all experiments, the dependent variable was the simple reenlistment rate, computed as 
in NAVY MILITARY PERSONNEL STATISTICS as the ratio of reenlistments to eligibles. 
Five types of independent variables were regressed against these reenlistment rates: 

1. Constants, simple and seasonal 

2. War Variables, constants and casualty counts representing the immediate, current 
period impact of the Viet Nam War 

3. Motivational Variables describing the influence of the Draft and economic conditions 
at the time of enlistment 

4. Current Economic Conditions in the Private Sector, aggregate and for occupation and 
industry labor market strata 

5. Relative Wages, military versus private sector rates of compensation. 

Plots generated for selected occupational categories revealed six observations, concen- 
trated in the early phases of the sample, lying considerably above the normal range of values. 



*All reenlistment rates studied pertained strictly to the population of recruits remaining in service for their full terms,
excluding all individuals who previously separated. No consideration was given to rates of attrition or their implications 
for the attributes of remaining eligibles. 



SENSITIVITY OF NAVY REENLISTMENT TO UNEMPLOYMENT AND WAGES 697 



Upon further investigation, it was determined that these outliers were related to involuntary
extensions and early separation programs, the effects of which differed noticeably among occu- 
pations and from period to period within a single occupation time series.* Rather than reducing 
the sample size, outliers were replaced with the average of the rates from the preceding and fol- 
lowing quarters. While seasonal changes were not taken into account, this means of adjusting 
for outliers preserved the general trends of the time series. 
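The outlier adjustment described above is a simple neighbor average. A minimal sketch with hypothetical rates:

```python
def smooth_outliers(rates, outlier_idx):
    """Replace each flagged quarter's rate with the mean of the preceding
    and following quarters, preserving the general trend of the series."""
    out = list(rates)
    for i in outlier_idx:
        out[i] = (rates[i - 1] + rates[i + 1]) / 2.0
    return out

# hypothetical quarterly reenlistment rates; 0.55 is an outlier at index 2
series = [0.12, 0.14, 0.55, 0.15, 0.16]
print(smooth_outliers(series, [2]))
```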

The basic equation around which experimentation evolved was defined as follows: 

RR = f(AUR, RW, DRAFT, WAR, S3)

where  AUR   = current unemployment rate (seasonally adjusted)
       RW    = ratio of military to private sector wages
       DRAFT = induction levels at the time of enlistment
       WAR   = dummy for the Viet Nam War
       S3    = third calendar quarter seasonal dummy

III. TWENTY YEAR EQUATIONS 

Table 1 contains definitions for all independent variables to which reference will be made
throughout the remainder of the paper. Table 2 describes equations estimated for all 80 obser-
vations. The single equation used in Table 2 is the study's baseline equation from which all
findings are derived.† The baseline equation has nine independent variables, plus a tenth
(DEF), for "change in definition," for Electronics and for Engineering & Hull (E&H).‡

Referring to Table 2, the statistical significance of the third quarter seasonal dummy is 
evident. Generally reenlistment rates are down by 3 to 5 points in the Summer and early Fall, 
an accounting phenomenon which results from individuals extending their terms of service for 
convenience into the Summer. Almost invariably, persons requesting these short-term exten- 
sions do not reenlist, thereby driving rates of reenlistment downward. The effect of these 
extensions is regular and of the same magnitude among the occupations. 

DRFT18 represents draft levels in units of 10,000 lagged 18 periods prior to reenlistment, 
approximately six to nine months before enlistment.** This lag was determined experimentally, 
in light of uncertainty regarding the precise personal and legal points of commitment to the 
Navy. It is reasonable to assume that this six to nine month period is associated with deferred 



*Early separation programs were instituted in 1958, 1960, and 1961. Involuntary extensions occurred in conjunction
with Berlin (1961), Cuba (1962) and Viet Nam (1965).

†A review of the correlation matrix for all variables indicated no signs of multicollinearity.

‡As of the fourth calendar quarter of 1973, a change in BUPERS policy affected reenlistment rates in ratings with six
year obligors. The effect of this procedural change fell primarily upon the Electronics and E&H categories, which have
heavy concentrations of long-term training programs. As a result, beginning in October 1973 Electronics and E&H first
term reenlistment rates are artificially inflated.

To compensate, the DEF dummy variable was added to the Electronics and E&H equations for the twenty year
sample. As Table 2 indicates, DEF was significant for both occupations, indicating average overstatements of reenlist-
ment rates of 27 and 10 percentage points for Electronics and E&H respectively over the period for which DEF was in
effect.

**With minor exceptions, the preponderance of Navy recruits covered by the sample enlisted for four years. Data
prohibited the isolation of persons entering under three or six year programs.






TABLE 1. Variable Definitions

Dependent Variable:
  Reenlistment Rate = Ratio of Reenlistees to Eligibles
          (e.g., .56 = 56% reenlistment)

Independent Variables:
  C       Constant
  AUR     Aggregate Seasonally Adjusted Unemployment Rate
          (e.g., .06 = 6% unemployment)
  ARAUR   Two Year Average Quarterly Rate of Change in AUR
          (e.g., -.07 = -7% average rate of decline in AUR)
  AUR13   Unemployment Rate 13 Periods Prior to Reenlistment
          (e.g., .06 = 6% unemployment)
  RW      Relative Wages (E-4 Base Pay to Private Sector Earnings)
          (e.g., .45 indicates E-4 pay is 45% of private sector wages)
  ARATE   Average Grade of Eligibles
          (e.g., 4.01 indicates average one point above E-4)
  WAR     Viet Nam War Dummy (1/1968 - 4/1972)
  DRFT18  Draft Levels 18 Periods Prior to Reenlistment
          (Inductees × 10^-5) (e.g., .96 = 96,000 inductees)
  S3      3rd Calendar Quarter Seasonal Dummy
  DEF     Dummy for Change in Definition
          (6-year obligors; Electronics and E&H only)


TABLE 2. Total, Twenty Year Sample
Determinants of Navy Reenlistment (Quarterly: 1/58-4/77)
Coefficients (t-Statistics) for All Variables

                          C       AUR     ARAUR   AUR13   RW      ARATE   WAR     DRFT18  S3      DEF     R²    D-W

Deck                    +0.52   +3.42   -0.35   +2.00   +0.54   -0.19   +0.02   -0.01   -0.03           .74   1.36
                        (2.13)  (4.85)  (1.77)  (2.63)  (6.16)  (2.74)  (0.91)  (3.49)  (2.65)

Ordnance                -0.36   +4.68   -0.66   +2.21   +0.69   +0.01   +0.08   -0.01   -0.05           .75   1.19
                        (1.05)  (4.73)  (2.41)  (2.11)  (5.67)  (0.15)  (2.50)  (3.69)  (2.96)

Electronics             -1.36   +2.24   +0.33   +2.24   +0.84   +0.27   +0.17   -0.01   -0.05   +0.27   .88   1.01
& Prec. Equip.          (2.76)  (1.48)  (.394)  (1.38)  (3.33)  (1.96)  (3.48)  (1.89)  (2.05)  (4.77)

Administration          +0.45   +3.37   -0.36   +0.92   +0.26   -0.12   -0.05   -0.00   -0.05           .74   1.63
                        (1.93)  (4.93)  (1.91)  (1.28)  (3.04)  (1.80)  (2.31)  (1.20)  (3.86)

Seaman                  +0.31   +1.84   -0.04   +0.62   -0.02   -0.07   -0.08   +0.00   -0.02           .63   1.42
                        (1.40)  (2.86)  (0.21)  (0.90)  (0.21)  (1.17)  (3.51)  (1.53)  (1.48)

Engineering & Hull      +0.09   +1.21   -0.02   +0.95   +0.28   -0.02   -0.03   -0.00   -0.04   +0.10   .81   1.62
                        (0.43)  (1.86)  (0.10)  (1.32)  (2.47)  (0.34)  (1.39)  (1.64)  (3.14)  (3.79)

Construction            -0.83   -0.47   -0.03   +2.78   +0.18   +0.21   -0.20   +0.01   -0.03           .69   1.12
                        (2.64)  (0.52)  (0.10)  (2.88)  (1.61)  (2.31)  (6.46)  (3.59)  (1.76)

Aviation                -0.00   +3.69   -0.17   +1.47   +0.28   -0.03   -0.04   -0.00   -0.03           .73   1.16
                        (0.00)  (5.06)  (0.84)  (1.91)  (3.08)  (0.49)  (1.62)  (1.54)  (2.61)

Medical & Dental        +0.13   +4.05   -0.45   +0.65   -0.45   -0.30   -0.05   -0.01   -0.05           .51   0.94
                        (0.41)  (4.31)  (1.73)  (0.65)  (1.73)  (0.30)  (1.54)  (1.32)  (2.90)

Significance: For 30 or more degrees of freedom, 90% level: t ≥ 1.65; 95% level: t ≥ 1.96.






entrance and/or elapsed time between the decision and act of enlistment. DRFT18 was 
included to indicate the proportion of Navy eligibles who may have enlisted under pressure of 
the Draft. Presumably, draft-motivated enlistees were less prone to reenlist than their counter- 
parts who selected service in the Navy without otherwise being compelled by the threat of 
induction. Except in the case of Construction, DRFT18 coefficients were negative, though very
small, and significant for only four occupations. Draft motivation among Navy enlistees does
not appear to have been, or to promise to be, of major importance for first term retention.

The Viet Nam War dummy (1968-1972, all inclusive) displays unexpected differences in 
sign among the five categories for which it is significant. Intuitively, a negative coefficient 
makes sense. War is dangerous and military service in time of war is by definition a hazardous 
vocation. Alternatively, more challenging assignments, greater rates of advancement, and 
perhaps also a heightened sense of purpose may have caused the positive WAR coefficients for 
Ordnance and Electronics groups. 

It is important to note that the WAR dummy variable was substantially more effective 
than an alternative which measured all-service casualty counts. While casualties were present as 
early as 1961, the figures increased dramatically in the late 1960's, roughly in concert with the 
anti-War movement. It may be the effect of that movement upon attitudes toward the military, 
rather than the war itself, which the WAR dummy variable is capturing. 

Of central importance are the three unemployment variables in Table 2 which define 
national labor market activity at the time of enlistment and reenlistment. AUR is the national 
aggregate unemployment rate and is representative of the availability of private sector employ- 
ment opportunities and of the difficulty and uncertainty associated with finding employment. 
As anticipated, AUR is significant, with large positive coefficients for seven of the nine occupa- 
tions. Post-recession recoveries typically entail 3 to 4 points reduction in unemployment over 2 
to 3 years. Taken literally, the AUR coefficients suggest these recoveries may precipitate reduc- 
tions of 15 to 25 points in reenlistment rates. 
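The magnitude cited rests on a one-line product of the AUR coefficient and the change in the unemployment rate (both expressed as fractions, per Table 1). A sketch with selected AUR coefficients from Table 2:

```python
# Literal reading of Table 2: a post-recession fall in unemployment of
# 3-4 points changes the predicted reenlistment rate by coefficient × ΔAUR.
aur_coeff = {"Deck": 3.42, "Ordnance": 4.68, "Aviation": 3.69,
             "Medical & Dental": 4.05}
delta_aur = -0.04                       # a 4-point fall in unemployment
for occ, b in aur_coeff.items():
    change = b * delta_aur              # change in predicted reenlistment rate
    print(occ, round(100 * change, 1))  # in percentage points
```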

As was generally true for tests of all variables in the equation, no exponential, logarithmic 
or other specification of AUR proved as effective as the untransformed variable. Experimenta- 
tion with polynomial distributed lag functions was unproductive. No industry or occupation
unemployment rates were as effective in explaining reenlistment as the aggregate AUR. Using
national data, correlations among these various rates of unemployment are in the high 90 per- 
centiles. Only local or regional statistics will show significantly different cyclical phasing for the 
different market strata to which different occupational groups might be sensitive. 

The purpose of the ARAUR variable is to provide the equation with a measure of dynam- 
ics. ARAUR is the average quarterly rate of change in AUR calculated over the previous six 
quarters, the period determined experimentally to be the most effective.* The more rapidly the 
unemployment rate is changing, the less likely the individual is to perceive or believe what is 
happening. For a given unemployment rate, the more rapidly the rate has changed to assume 
its current value, the higher (for AUR falling) or lower (for AUR rising) the reenlistment rate. 
This is thought to be the reason for the negative ARAUR coefficients. 
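One plausible reading of the ARAUR definition (consistent with the Table 1 example of -.07 as a 7% average rate of decline) is the mean of quarter-over-quarter relative changes in AUR over the preceding six quarters. The AUR series below is hypothetical:

```python
def araur(aur_series, t, window=6):
    """Average quarterly rate of change in AUR over the `window` quarters
    preceding quarter t (e.g., -0.07 = a 7% average rate of decline)."""
    changes = [(aur_series[i] - aur_series[i - 1]) / aur_series[i - 1]
               for i in range(t - window + 1, t + 1)]
    return sum(changes) / window

# hypothetical quarterly AUR values (fractions, as in Table 1)
aur = [0.060, 0.058, 0.056, 0.055, 0.053, 0.051, 0.049]
print(round(araur(aur, 6), 3))
```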

At least theoretically, an individual's reenlistment decision should be based upon his 
expectations of future private sector economic conditions. His sense of what he can earn in the 
private sector in both the near and distant future will be important. Short-term private sector 
earnings are his "opportunity cost," income he must forgo for Navy training and job experience. 



*Experimentation was deemed appropriate because of uncertainty regarding the enlistee's time horizon.



700 L. COHEN & D.E. REEDY 

Long-term prospects in the private sector will characterize the rate of return he calculates for 
his investment in the Navy. In general, his expectations depend upon the current condition of 
the economy, plus the trend of developments in the economy which he will interpret, perhaps 
simply extrapolate into the future. Comparing two periods in which the unemployment rate is
the same, tentative reenlistees should be relatively optimistic or pessimistic about their pros- 
pects in the private sector depending on whether unemployment has been falling or rising. A 
priori, dynamics were expected to be a significant factor for most, if not all occupations. 

Two major difficulties have probably inhibited the effectiveness of ARAUR, which was
significant in only three instances. One is that all such variables combine notions of speed with
direction. Merging these two aspects of dynamics may be confusing if they solicit altogether 
different reactions from the tentative reenlistee. The second problem with dynamics is that the 
importance of information about trends diminishes at very high and very low values of AUR. 
When unemployment nears its high value in the individual's memory, its probability of falling 
in the future increases dramatically. Likewise, very low rates of unemployment will be 
expected to rise simply as an exponential function of the period over which they have persisted. 
Dynamics are probably most informative when unemployment rates are in their moderate 
range, when the future is more in doubt. 

Perhaps an overriding consideration is that enlistees' information about economic condi- 
tions is derived primarily from communications from their points of origin and duty stations. 
These decidedly local data describe economic conditions and relevant industry and occupational 
market activities in the community and region in which the tentative reenlistee will consider 
settling. No one actually obtains employment in the national economy. National economic 
conditions, to which AUR refers, are in fact often unrelated to the level and dynamics of an 
individual area economy, a point which must be kept in mind when interpreting the significance 
of all the unemployment and relative wage variables. 

It is interesting to note that the unemployment rate for 20-24 year olds was noticeably less 
effective than the aggregate rate of unemployment (AUR) which encompasses all age groups. 
One explanation is that data on the 20-24 year old cohort is characterized by labor market con- 
ditions in many of the major metropolitan areas, including those with large low income popula- 
tions, while the Navy traditionally draws from more rural and less metropolitan areas with pro- 
portionately greater numbers of lower-middle and middle income families. 

The significance of the unemployment variable, AUR13, should be especially sensitive to 
discrepancies between composite national and relevant local economic indicators. By the time 
his first term is nearing expiration, the individual is relatively distant from his home environ- 
ment and even somewhat isolated from his duty station. At the outset of his term during the 
period to which AUR13 refers, the individual has just left his home economy. He is as 
knowledgeable about that local economy as he will ever be. Ideally, the equations should refer- 
ence local economic conditions at the time of enlistment, rather than AUR13 which is a 
national unemployment rate. 

As shown in Table 2, AUR13 (unemployment lagged 13 periods prior to reenlistment)
is significant in four instances. Most importantly, its coefficients are positive. AUR13 is
believed to have bearing on the enlistee's motivation and propensity to reenlist in two respects. 
Unemployment (economic conditions) about the time of enlistment establishes the climate in 
the context of which individuals, of significantly different purpose and motivation, decide to 
enlist. Alternatively, unemployment about the time of enlistment describes the background 
against which enlistees make definite, though tentative, career decisions very early during their 




first terms, decisions which are either consistent or inconsistent with reenlistment three or four
years later. 

Before experimentation was undertaken with "motivational" unemployment rates such as 
AUR13, two hypotheses were formulated to anticipate their performance in the equation. 
From other studies in which the authors were engaged, preliminary evidence had been 
developed that enlistees are either job or training oriented and will react accordingly in different 
ways to changes in the economy. The latter group is less inclined to reenlist. Their tendency is 
to view the Navy as a paid vocational college where they can effect increases in their human 
capital in preparation for returning to the private sector. Job oriented individuals who are more 
concerned with immediate employment do not see the same private sector alternatives, and are 
more likely to be attracted to the Navy for its long-term career potential. 

Depending upon their orientation, these groups are believed to react to the Navy in oppo- 
site directions in response to economic fluctuations. The training oriented group is more prone 
to enlist when the economy is strong, attracted by advertised private sector positions which 
require skills and experience readily available from the Navy. More importantly, the composi- 
tion of enlistees changes when the economy is strong because the job oriented population need 
not rely as heavily upon the Navy as an alternative employer. Job oriented persons will favor 
enlistment when the unemployment rate is high, while sophisticated training oriented types 
should have greater success protecting their employment status and income. The latter are 
more employable at enlistment and four years later when their first term is completed. When 
unemployment is high, the enlisted population is more job oriented and will be characterized by 
a higher first term reenlistment rate, suggesting a positive, significant coefficient for unemploy- 
ment rates just prior to enlistment. 

Interestingly, experimentation established the clear superiority of AUR13 — six to nine 
months after enlistment — over any lagged unemployment variable going back to, or prior to 
the period of enlistment. This association with unemployment in the third quarter after enlist- 
ment is supportive of a second hypothesis. Early into his first term, the typical enlistee is form- 
ing opinions about the Navy and is making conscious career decisions regarding training and the 
degree of his commitment to potentially long-term service. It is at this time that conditions in 
the private sector (probably in his home town) are taken into account.* If his experiences have 
been generally negative, he may look more favorably upon the private sector. If the economy 
is healthy, he may decide not to work to enhance his status in the Navy beyond what he consid- 
ers necessary to maximize his success upon return to the private sector. Having never fully 
committed himself to the Navy, the enlistee never seriously considers extending his service 
through a second term. 

In contrast to unemployment variables to which policymakers can only react, RW (relative 
wages) is a parameter over which the Navy has direct control. (The relative wage variable was 
calculated with reference to E-4 base pay, excluding indirect benefits and bonuses, lump sum or 
installment. RW is the ratio of E-4 base pay to the earnings of private nonagricultural, nonsu- 
pervisory production workers.) Relative wages were significant for all but Seaman and Con- 
struction personnel, with predictably positive, though small coefficients. The implication is that 
relative wages, exclusive of bonuses or benefits, are ineffective as a policy variable to cause 
independent changes in reenlistment or to combat the effects of an improving economy. 
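In code form, the RW definition above reduces to a single ratio. A minimal sketch (the dollar figures below are hypothetical illustrations, not values from the study):

```python
def relative_wage(e4_base_pay, private_sector_earnings):
    """RW: ratio of E-4 base pay to the earnings of private nonagricultural,
    nonsupervisory production workers (bonuses and indirect benefits excluded)."""
    return e4_base_pay / private_sector_earnings

# Hypothetical monthly figures, for illustration only (not from the study)
print(relative_wage(450.0, 600.0))  # 0.75
```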



*During their first six months of service, recruits have two weeks leave, an early opportunity to evaluate their enlistment decisions at home in the company of family and friends, and with an unobstructed view of local private sector labor market conditions.



702 L. COHEN & D.E. REEDY 

Even more troublesome regarding the efficacy of relative wages as a policy instrument is 
that the meaning of the RW variable is suspect. Values for RW trace out a low grade exponen- 
tial curve which, especially in recent years, parallels quality of life improvements which have 
been implemented for enlisted personnel. RW is the only variable in the equation which fol- 
lows this general form and may simply be serving as a proxy for other factors favorable to reen- 
listment which have not, and probably could not be captured by the equations.* 

IV. EARLY AND RECENT SAMPLES 

The dynamics of the national economy changed after 1970. Following almost a decade 
during which the economy experienced continued improvement, the 1970's brought two sharp 
recessions. These fluctuations occurred in the context, and perhaps to some extent as a result 
of a barrage of new socio-political and technological phenomena and events which grew out of 
the Viet Nam Conflict and coincided with the maturing of post-World War II baby-boom labor. 
Even the legendary work ethic which has supposedly sustained the character of the American 
economy since its inception began to suffer a noticeable loss of popularity. 

It can be assumed that, against this background of complex and rapid change, a new busi- 
ness cycle and labor force mentality have emerged from the late 1960's. Attitudes toward the 
military cannot have been unaffected. It follows that the relationships between reenlistment 
and its determinants, especially unemployment and relative wages, may have also been altered 
and differ now from what they were a decade ago. More than likely, a single model describing 
the complete twenty year time span will not be appropriate for projecting reenlistment behavior 
in the near future. Reenlistment over the next five to 10 years will probably be more con- 
sistent with its history during the very recent past, beginning in the late 1960's, early 1970's. 

To test for the existence of unique recent period relationships, the 80 quarter reenlistment 
time series was split into two ten year samples, 1/1958-4/1967 and 1/1968-4/1977.† The results 
of the earlier sample are shown in Table 3. Compared to the equations based on all 80 observa- 
tions, these early sample equations are obviously less effective. In addition to the seasonal 
dummy, the equations are dominated by the unemployment rate variables, especially current 
period AUR. Neither the Draft variable nor RW (relative wages) was significant for a single 
occupational category. With the exception of the Aviation and Medical categories, the generally 
low R² statistics indicate that the early sample equations are improperly specified or, more
likely, underspecified. DRFT18 and RW might be significant in the context of a more explicit, more
complete model.‡ The point remains, however, that the same factors (with the exception of 
the WAR variable which was not defined prior to 1968) which explain reenlistment to a reason- 
able degree of effectiveness over twenty years have failed to repeat that achievement for a 
shorter time frame. 
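The split-sample test described above amounts to estimating the same linear specification separately on each ten year sub-period and comparing coefficients, t-statistics, and fit. A self-contained sketch of the underlying computation for a one-regressor case, using synthetic data rather than the study's series:

```python
import math

def ols_simple(x, y):
    """Ordinary least squares for y = a + b*x.
    Returns (a, b, t_b, r2): intercept, slope, slope t-statistic, R-squared."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    sse = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - my) ** 2 for yi in y)
    se_b = math.sqrt(sse / (n - 2) / sxx)  # standard error of the slope
    return a, b, b / se_b, 1.0 - sse / sst

# Synthetic quarterly series (NOT the study's data): a reenlistment rate
# loosely tied to an unemployment rate series
aur  = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0, 7.5]
rate = [20.0, 22.5, 24.8, 27.6, 30.1, 32.4, 35.2, 37.3]
a, b, t_b, r2 = ols_simple(aur, rate)
print(round(b, 2), round(r2, 3))  # 5.0 0.999
```

Running the same routine on each sub-sample and inspecting the t-statistics against the critical values quoted in the tables is, in essence, the comparison carried out in the text.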

The results of the later sample are shown in Table 4. Most notable, as compared to the 
earlier sample, are the R² and constant terms.** The R² statistics are impressive, and the 



*ARATE, the average rate of first term eligibles, exhibits a gradual, continual increase over time. ARATE was introduced to recognize differences in attitudes and earnings among enlistees which would be a function of levels of achievement. In fact, ARATE may be driven by reenlistment rates as a result of changes in promotion policies.
†Analysis of correlation matrices indicated that multicollinearity was not a problem in either of these sample periods. 
‡The insignificance of the Draft variable (DRFT18) in the early equations probably derives from the peacetime period for which the variable was relevant. DRFT18 describes motivation at the time of enlistment, and therefore measures peacetime levels of inductees, 1953-1963.

**The WAR variable, absent in the earlier sample, was not responsible for the generally remarkable performance of the 
recent sample equations. This conclusion was substantiated by tests which estimated equations for the recent sample 
without the WAR variable present. Changes in results were nominal, with only the Durbin-Watson statistics showing 
any appreciable deterioration. 



SENSITIVITY OF NAVY REENLISTMENT TO UNEMPLOYMENT AND WAGES 



703 



TABLE 3. Early Ten Year Sample

Determinants of Navy Reenlistment (Quarterly: 1/58-4/67)

Coefficients (t-Statistics) For All Variables

                            C      AUR   ARAUR   AUR13      RW   ARATE     WAR  DRFT18      S3     R²    D-W
Deck                    +0.58    +2.65   -0.25   +0.56   +0.35   -0.16      --   -0.01   -0.04    .36   1.70
                       (2.00)   (1.94)  (0.89)  (0.71)  (1.30)  (2.13)          (1.02)  (2.79)
Ordnance                +0.59    -1.18   -0.09   +1.45   -0.18   -0.06      --   -0.01   -0.05    .48   2.13
                       (1.86)   (0.80)  (0.28)  (1.68)  (0.61)  (0.75)          (1.53)  (3.26)
Electronics             -0.56    -1.10   -0.01   +3.01   +0.42   +0.15      --   -0.01   -0.07    .52   1.81
 & Prec. Equip.        (1.29)   (0.54)  (0.02)  (2.52)  (1.03)  (1.29)          (1.15)  (3.01)
Administration          +0.54    +5.37   -0.71   -0.24   +0.33   -0.16      --   -0.00   -0.06    .59   2.17
                       (1.81)   (3.80)  (2.45)  (0.30)  (1.17)  (2.03)          (0.64)  (3.84)
Seaman                  +0.20    +4.26   -0.32   -0.28   +0.19   -0.08      --   -0.00   -0.02    .57   1.97
                       (0.76)   (3.43)  (1.26)  (0.38)  (0.78)  (1.20)          (0.47)  (1.63)
Engineering & Hull      +0.25    +2.21   -0.13   +0.18   +0.19   -0.06      --   -0.00   -0.03    .36   2.17
                       (1.08)   (2.06)  (0.58)  (0.29)  (0.87)  (0.93)          (0.85)  (2.90)
Construction            -0.25    +3.31   -0.50   +1.17   -0.20   +0.07      --   +0.00   -0.04    .51   1.52
                       (0.82)   (2.34)  (1.72)  (1.41)  (0.70)  (0.92)          (0.60)  (2.61)
Aviation                +0.08    +5.27   -0.14   -0.24   -0.02   -0.03      --   -0.00   -0.04    .74   2.22
                       (0.34)   (4.53)  (0.57)  (0.35)  (0.08)  (0.45)          (1.07)  (3.11)
Medical                 +0.22    +7.69   -0.84   -1.48   -0.27   -0.05      --   -0.00   -0.06    .77   1.91
 & Dental              (0.70)   (5.16)  (2.75)  (1.71)  (0.90)  (0.55)          (0.51)  (4.01)

Significance: For 30 or more degrees of freedom, 90% level: t ≥ 1.65; 95% level: t ≥ 1.96.

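The significance thresholds quoted in the table notes are the familiar large-sample two-tailed critical values; for 30 or more degrees of freedom the t distribution is close enough to normal that they can be recovered, approximately, from standard normal quantiles using only the standard library:

```python
from statistics import NormalDist

# Large-sample (normal) two-tailed critical values; with 30+ degrees of
# freedom the t distribution is already close to these.
z = NormalDist()
t90 = z.inv_cdf(0.95)   # 90% two-tailed: 5% in each tail
t95 = z.inv_cdf(0.975)  # 95% two-tailed: 2.5% in each tail
print(round(t90, 3), round(t95, 3))  # 1.645 1.96
```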





TABLE 4. Recent Ten Year Sample

Determinants of Navy Reenlistment (Quarterly: 1/68-4/77)

Coefficients (t-Statistics) For All Variables

                            C      AUR   ARAUR   AUR13      RW   ARATE     WAR  DRFT18      S3     R²    D-W
Deck                    -1.12    +5.25   -1.09   +8.64   +0.02   +0.16   +0.07   -0.00   -0.03    .88   1.98
                       (1.88)   (4.56)  (3.49)  (5.08)  (0.08)  (1.13)  (1.85)  (1.59)  (1.75)
Ordnance                -2.58    +8.26   -1.44   +6.30   -0.05   +0.53   +0.11   -0.01   -0.05    .78   1.42
                       (2.65)   (4.37)  (2.82)  (2.26)  (0.14)  (2.32)  (1.70)  (1.39)  (1.81)
Electronics             -1.42    +7.16   +0.61   +3.82   +0.61   +0.28   -0.01   -0.01   -0.02    .73   1.01
 & Prec. Equip.        (0.86)   (2.09)  (0.71)  (0.82)  (1.01)  (0.73)  (0.10)  (0.50)  (0.39)
Administration          -1.20    +2.80   -0.53   +6.96   +0.01   +0.24   -0.02   -0.00   -0.03    .89   2.03
                       (2.35)   (2.82)  (2.00)  (4.77)  (0.08)  (1.98)  (0.52)  (1.30)  (2.08)
Seaman                  -1.38    -0.37   +0.27   +6.41   +0.41   +0.23   +0.03   -0.00   -0.00    .81   1.30
                       (6.41)   (0.41)  (1.11)  (4.80)  (2.41)  (2.11)  (1.03)  (1.57)  (0.33)
Engineering & Hull      -1.56    +1.23   -0.24   +8.03   +0.33   +0.27   -0.01   -0.07   -0.02    .89   1.41
                       (2.86)   (1.10)  (0.78)  (4.88)  (1.57)  (2.08)  (0.22)  (1.93)  (1.31)
Construction            -1.68    -5.12   +0.71   +6.44   +0.41   +0.40   -0.22   +0.01   -0.00    .89   1.72
                       (2.45)   (3.85)  (1.98)  (3.29)  (1.62)  (2.51)  (4.90)  (1.51)  (1.51)
Aviation                -1.34    +1.29   -0.37   +8.26   +0.33   +0.22   +0.02   -0.01   -0.02    .91   2.12
                       (2.75)   (1.90)  (1.46)  (5.92)  (1.85)  (1.95)  (0.65)  (2.56)  (1.44)
Medical                 -1.68    -1.22   +0.28   +8.22   +0.80   +0.22   -0.08   -0.01   -0.01    .79   1.54
 & Dental              (2.76)   (1.04)  (0.87)  (4.74)  (3.56)  (1.90)  (1.97)  (2.18)  (0.33)

Significance: For 30 or more degrees of freedom, 90% level: t ≥ 1.65; 95% level: t ≥ 1.96.

constants generally suggestive of equations which describe different relationships than those 
captured by the early and total sample equations. Certainly, the single most important reason 
for the superior performance of the recent sample equations is the overall strength of the 
unemployment indicators, AUR, ARAUR and AUR13.* Assuming that attitudes toward mili- 
tary service have been altered, it is likely that they have changed to favor a heightened sensi- 
tivity to economic fluctuations. More and more recruits may be viewing military service strictly 
as a vocational decision. If they are training oriented, they watch the economy to discern when 
they can most effectively capitalize on the increases in their human capital which the Navy is 
providing them. If they are job oriented and see the Navy as a fall-back alternative to private 
sector unemployment, they may remain in the Navy only until they detect better opportunities 
on the outside. Either way, job or training oriented, the enlistee will do his best to keep in 
touch with economic conditions, more now than in the 1960's when economic motivations were 
less influential relative to social and personal psychological factors. 



*The negative coefficient for AUR in the Construction equation (Table 4) may indicate that Navy Construction person- 
nel identify with the public works (infrastructure) component of the construction industry. Public works construction is 
sometimes undertaken as part of counter-cyclical employment programs, and must precede or follow residential and 
commercial/industrial development. As such, public works employment may experience cycles approaching 180 de- 
grees out of phase with other construction activities which are more closely associated with movements in the national 
unemployment rate. This assumption about Navy construction personnel is not necessarily inconsistent with the 
typically positive coefficient for AUR13. Six to nine months into his first term, the new recruit may not yet consider 
himself affiliated with the construction industry. 




RW is significant and positive for only three occupational categories, and of minor impact 
in these instances. Redefining relative wages based upon Regular Military Compensation did 
nothing to increase the number of significant cases, nor did it affect significant RW coefficients 
substantially. Taking into account housing and subsistence allowances and the associated tax 
advantages produced results which were no more impressive. 

V. FINDINGS AND IMPLICATIONS 

For the analysis, conceptualization and timing of programs designed to anticipate or con- 
trol reenlistment, the recent sample equations would seem to be the most pertinent of the three 
models (early or recent samples, or all 80 observations), for two reasons. First, economic 
conditions in the near future are more likely to resemble the early 1970's than the 1960's. 
Those social, political and economic phenomena which have precipitated the new dynamics of 
recent fluctuations are likely to persist. Second, contemporary attitudes toward military service 
are part of a generally irreversible evolution of mores and traditions. Intuitively, changes in 
attitudes should involve an increasingly heavy emphasis by recruits upon the vocational aspects 
of service. Sensitivities to, and knowledge of economic conditions should be increasing, 
perhaps as indicated by the performance of unemployment rate variables, particularly AUR13, 
in the recent sample equations. Quite probably, it is these relationships which are responsible 
for the effectiveness of equations based on the entire twenty year time series. 

Assuming that neither war nor the Draft is likely to recur in the near 
future, the only recurrent determinants of reenlistment — other than subjective factors not cap- 
tured by the equations — are unemployment (economic conditions) and relative wages. Rela- 
tive wages, now that parity between military and private sector compensation is guaranteed by 
law, are effectively constant, excluding the payment of bonuses. That leaves the economy, and 
perhaps bonuses also, as the dominant influences which will characterize reenlistment behavior 
in the near future. (Time series data on which the study was based precluded incorporating 
lump sum or installment bonuses into either the RW variable or elsewhere in the equations.) 

As Table 5 indicates, with reference only to unemployment rate variables, it is possible to 
explain from 51 to 86 percent of the variation in reenlistment rates over the past ten years.* 
The significance of the unemployment variables is particularly impressive considering: 

1. The equations include no quality of life indicators representative of improvements 
which have occurred to enlisted working conditions. 

2. No sampling has occurred to separate individuals by sex, ethnicity, family status or 
mental group. 

3. The unemployment rate data being used is national and aggregate. No direct refer- 
ences have been made to local economic conditions in the enlistee's home area or duty station. 

Without question, reenlistment rates are highly sensitive to economic conditions at reenlistment 
and enlistment, represented by unemployment rates as indicators of the availability and 
difficulty of securing private sector employment. The effect of current unemployment is quite 
strong, but the effect of unemployment about the time of enlistment, AUR13, is especially 
pronounced. 



*Some R² statistics are undoubtedly biased upwards due to the presence of positive serial correlation. 




TABLE 5. Recent Ten Year Sample

Determinants of Navy Reenlistment (Quarterly: 1/68-4/77)

Coefficients (t-Statistics) for Significant Variables

                            C      AUR   ARAUR   AUR13     R²    D-W
Deck                    -0.30    +4.77   -0.97   +6.28    .83   1.54
                       (6.80)   (7.41)  (4.80)  (6.07)
Ordnance                   --    +7.89   -1.23      --    .68   1.19
                                (7.27)  (3.61)
Electronics                --   +10.44      --      --    .69   0.83
 & Prec. Equip.                 (6.39)
Administration          -0.26    +4.02   -0.78   +6.69    .85   1.92
                       (6.92)   (7.14)  (4.41)  (7.41)
Seaman                  -0.22    +1.79      --   +4.77    .67   1.12
                       (5.85)   (3.18)          (5.27)
Engineering & Hull      -0.30    +3.88   -0.78   +7.83    .82   1.36
                       (6.70)   (5.84)  (3.72)  (7.34)
Construction            -0.33       --      --  +10.27    .51   0.43
                       (3.70)                   (4.82)
Aviation                -0.32    +3.95   -0.81   +7.40    .86   1.70
                       (8.16)   (6.84)  (4.48)  (7.98)
Medical                 -0.15    +2.13      --   +5.46    .56   1.03
 & Dental              (2.75)   (2.63)          (4.19)

Note: Coefficients and t-statistics are omitted for variables not significant at the 90% level.
For 30 or more degrees of freedom, 90% level: t ≥ 1.65; 95% level: t ≥ 1.96.

The effects of AUR and AUR13 combined can be devastating for reenlistment. Compare 
two "classes" of recruits, one enlisting and coming up for reenlistment during peaks in the econ- 
omy, the other during low points. Assuming these peaks and troughs are separated by only two 
percent, six to eight percent unemployment for example, the total difference in reenlistment 
rates between the two groups could be as high as 27 to 29 percent for Deck and Ordnance, as 
low as three percent for Construction. 
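The arithmetic behind this comparison can be reproduced from the AUR and AUR13 coefficients reported in Table 4, assuming each coefficient converts a one-point unemployment difference directly into reenlistment-rate points:

```python
# AUR and AUR13 coefficients as reported in Table 4 (recent ten year sample)
coef = {
    "Deck":         (5.25, 8.64),
    "Ordnance":     (8.26, 6.30),
    "Construction": (-5.12, 6.44),
}

gap = 2.0  # peak-to-trough unemployment difference, e.g. six vs. eight percent

# Total reenlistment-rate swing if both enlistment-time (AUR13) and
# reenlistment-time (AUR) unemployment differ by `gap` points
swings = {occ: round(gap * (aur + aur13), 1) for occ, (aur, aur13) in coef.items()}
for occ, swing in swings.items():
    print(f"{occ}: {swing} points")
```

The computed swings are roughly 27.8 points for Deck, 29.1 for Ordnance, and 2.6 for Construction, matching the "27 to 29 percent" and "three percent" figures quoted in the text.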



Striking by comparison is the poor performance of relative wages, the variable RW. Sta- 
tistically significant for only three occupations, its effect in those cases is nominal, especially 
considering the magnitude of the unemployment rate coefficients. Calculating the rates of sub- 
stitution between unemployment and relative wages, the latter appears ineffective as a means of 
protecting reenlistment rates from a healthy, vigorous private sector. 

Comparison of equations suggests that the Navy's enlisted workforce is not a homogene- 
ous mass for which precisely the same reenlistment programs would be appropriate. Differences 
in reactions to the economy and relative wages among the nine major occupational groups 
which were studied are evident and of significant order of magnitude. The policy implication is 
that different occupations should be treated differently to effect comparable degrees of control 
over their reenlistment rates. 




Unemployment rates, current and at the time of enlistment, are the principal deter- 
minants of reenlistment. The effects of changes in current economic conditions (AUR) are 
substantial, in the neighborhood of 4 to 5 percentage points change in reenlistment rates 
for every 1 point change in unemployment. Moreover, the significance and power of the 
AUR13 variable is particularly important for the nature and timing of programs designed to 
increase reenlistment. It is apparent that the propensity to reenlist is to a great extent deter- 
mined very early in the first term, at or about the time of enlistment. The importance of 
AUR13 has profound implications for the incidence of recruiting expenditures and early enlist- 
ment counseling programs. Finally, although for some occupations they may be a significant 
determinant of reenlistment, military relative to private sector rates of compensation are generally 
powerless to compensate for the effects of economic fluctuations. 

To the extent that bonuses paid in installments over long periods are viewed as regular 
wages, they may be ineffective as reenlistment incentives. Other than their importance for 
enlistment, relative wages can probably be allowed to deteriorate when economic conditions 
favor rising reenlistment rates.* More importantly, the order of magnitude of increases in mili- 
tary compensation — excluding bonuses — necessary to protect reenlistment rates when 
economic conditions are driving those rates down is almost assuredly financially and politically 
unacceptable. No evidence was discovered indicating that reenlistment has been affected by 
variations in benefits for housing, subsistence, or tax advantages. 

If relative wages are not effective as a program instrument to control reenlistment, the 
Navy has only two alternatives: lump-sum incentive grants which are more impressive, at least 
more visible than wage increases (or installment bonuses) and probably less expensive for a 
given impact; and procedural changes affecting the pace of promotions and intra-Navy job and 
occupational mobility so as to bring the career patterns of enlisted personnel more in line with 
those of their private sector counterparts. The latter may be especially important in light of the 
orientation toward employment opportunities and lack of concern for direct and indirect mone- 
tary compensation which this study has identified. 

Differences between Navy and private sector rates of achievement and mobility — vertical 
and horizontal — may produce a degree of discontentment and sense of falling behind the 
private sector. This problem has potentially serious ramifications for first and even second term 
reenlistment, and is clearly a policy issue deserving immediate study. Unemployment rates are 
factors to which the Navy can react or for which it can prepare, but not control. Since relative 
wages are now basically constant and if they are generally powerless to counteract economic 
stimuli, substantial lump sum bonus programs and improvements in career development paths 
may be necessary to counteract or nullify the link between business cycles and reenlistment. 



VI. RESERVATIONS AND LIMITATIONS 

There are the standard set of reservations associated with regression analysis and all 
methods of statistical inference involving correlation. Specifically with respect to the equations 
discussed above, there are a few problems. The signs of some coefficients are troublesome. 



*There is evidence that the major proportion of tentative enlistees are either oblivious or insensitive to rates of military 
compensation. At most, those who do not enlist may be somewhat put off by a general impression that military pay is 
relatively low compared to the private sector. See for example, D. Grissmer, et al., "An Econometric Analysis of 
Volunteer Enlistments by Service and Cost Effectiveness Comparison of Service Incentive Programs," OAD-CR-66, 
General Research Corporation, October 1974. 




The Durbin-Watson statistics are generally indicative of serial correlation among error terms, 
more severe for some occupations than others. As noted above, the RW variable is suspect. 
Its mildly exponential behavior is coincidental with a number of other phenomena which might 
have also affected reenlistment. Despite these difficulties, the equations do well and are reliable 
for what they convey about the direction and order of magnitude of the effects of unemploy- 
ment upon reenlistment. 
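The Durbin-Watson statistic referred to here is computed directly from the regression residuals; values near 2 indicate no first-order serial correlation, while values toward 0 indicate positive autocorrelation. A minimal sketch with artificial residual series:

```python
def durbin_watson(resid):
    """DW = sum((e_t - e_{t-1})^2) / sum(e_t^2).
    Near 2: no first-order serial correlation; toward 0: positive
    autocorrelation; toward 4: negative autocorrelation."""
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e ** 2 for e in resid)
    return num / den

# Alternating residuals (negative autocorrelation) vs. a slow drift (positive)
print(round(durbin_watson([1, -1, 1, -1, 1, -1]), 2))         # 3.33
print(round(durbin_watson([1, 1.1, 1.2, 1.1, 1.0, 0.9]), 2))  # 0.01
```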

The principal limitations to direct policy application of these findings derive from the sim- 
plicity of the time series data on which they are based. Without specific ratings data, reenlist- 
ment bonuses could not be taken into account. Available time series data prohibit effecting any 
controls for personal or socio-demographic considerations, notably ethnicity, sex, family status 
and especially mental group. The time series analysis above fails to pinpoint who is leaving and 
to explain precisely why they leave. 

Equally important, the data have prevented any reference to labor market conditions in 
either the enlistee's duty station or home economy where he might consider settling. Local 
economies differ substantially in the way they experience phases of any national cycle, 
differences which a complete analysis of reenlistment behavior must take into account. To a 
great extent, compensation for these disadvantages of time series analysis could be accom- 
plished via supplementary cross-sectional analysis of area reenlistment rates using the Enlisted 
Master File or some related set of records. 

ACKNOWLEDGMENTS 

The authors gratefully acknowledge the technical assistance of Ms. Deborah Coffin 
without whom this paper would have suffered considerable loss of substance and detail, and the 
valuable criticisms and comments of Drs. Alfred Rhode and John Martin of Information Spec- 
trum, Inc., Mr. Irwin Schiff and LCDR Kevin Delaney of OP-964D, and Mr. Samuel Kleinman 
of the Center for Naval Analyses. 

BIBLIOGRAPHY 

[1] Albrecht, M., "A Discussion of Some Applications of Human Capital Theory to Military 

Manpower Issues," P-5727, RAND, September 1976. 
[2] Bryan, J. and A. Singer, "Prediction of Reenlistment Using Regression Estimation of Event 

Probabilities," Research Contribution No. 13, Center for Naval Analyses, October 1965. 
[3] Cooper, R., "The All-Volunteer Force: Five Years Later," P-6051, RAND, December 

1977. 
[4] Enns, J., "Effect of the Variable Reenlistment Bonus on Reenlistment Rates: Empirical 

Results for FY-71," R-1502-ARPA, RAND, June 1975. 
[5] Grissmer, D., et al., "An Econometric Analysis of Volunteer Enlistments by Service and 

Cost Effectiveness Comparison of Service Incentive Programs," OAD-CR-66, General 

Research Corporation, October 1974. 
[6] Haber, S. and C. Stewart, Jr., "The Responsiveness of Reenlistment to Changes in Navy 

Compensation," TR-1254, George Washington University, May 1975. 
[7] Lindsay, W., Jr. and B. Causey, "A Statistical Model for the Prediction of Reenlistment," 

TP-342, Research Analysis Corporation, March 1969. 
[8] Lindsay, W., et al., "Simple Regression Models for Estimating Future Enlistment and 

Reenlistment in Army Manpower Planning," TP-402, Research Analysis Corporation, 

September 1970. 
[9] Lockman, R., et al., "Motivational Factors in Accession and Retention Behavior," Research 

Contribution 201, Center for Naval Analyses, January 1972. 




[10] Massell, A., "An Imputation Method for Estimating Civilian Opportunities Available to 

Military Enlisted Men," R-1565-ARPA, RAND, July 1975. 
[11] Massell, A., "Reservation Wages and Military Reenlistments," P-55336, RAND, February 

1976. 
[12] Nelson, G., "An Economic Analysis of First Term Reenlistments in the Army," P-647, 

Institute for Defense Analyses, June 1970. 
[13] Quigley, J. and R. Wilburn, "An Economic Analysis of First Term Reenlistment in the Air 

Force," AFPDPL-PR-69-017, Personnel Research and Analysis Division, Directorate of 

Personnel Planning, USAF, September 1969. 
[14] Young & Rubicam, Inc., "Naval Retention: A Problem of Empathy," May 1970. 



INDEX TO VOLUME 26 

ALAM, K. "Distribution of Sample Correlation Coefficients," Vol. 26, No. 2, June 1979, pp. 327-330. 

AL-AYAT, R. and R. Fare, "On the Existence of Joint Production Functions," Vol. 26, No. 4, Dec. 1979, pp. 627-630. 

ARMSTRONG, R.D. and E.L. Frome, "Least-Absolute-Value Estimators for One-Way and Two-Way Tables," Vol. 26, 

No. 1, Mar. 1979, pp. 79-96. 
BARLOW, R.E., "Geometry of the Total Time on Test Transform," Vol. 26, No. 3, Sept. 1979, pp. 393-402. 
BARZILY, Z., W.H. Marlow and S. Zacks, "Survey of Approaches to Readiness," Vol. 26, No. 1, Mar. 1979, pp. 21-31. 
BAZARAA, M.S. and A.N. Elshafei, "An Exact Branch-and-Bound Procedure for the Quadratic-Assignment Problem," 

Vol. 26, No. 1, Mar. 1979, pp. 109-121. 
BERG, M. and B. Epstein, "A Note on a Modified Block Replacement Policy for Units with Increasing Marginal Run- 
ning Cost," Vol. 26, No. 1, Mar. 1979, pp. 157-160. 
BHAT, U.N., M. Shalaby and M.J. Fischer, "Approximation Techniques in the Solution of Queueing Problems," Vol. 

26, No. 2, June 1979, pp. 311-326. 
BITRAN, G.R., "Experiments With Linear Fractional Problems," Vol. 26, No. 4, Dec. 1979, pp. 689-693. 
BULFIN, R.L., R.G. Parker and C.M. Shetty, "Computational Results with a Branch-and-Bound Algorithm for the 

General Knapsack Problem," Vol. 26, No. 1, Mar. 1979, pp. 41-46. 
BUTLER, D.A., "A Complete Importance Ranking for Components of Binary Coherent Systems, With Extensions to 

Multi-State Systems," Vol. 26, No. 4, Dec. 1979, pp. 565-578. 
CHANDRA, R., "On n/l/F Dynamic Deterministic Problems," Vol. 26, No. 3, Sept. 1979, pp. 537-544. 
CHARNETSKI, J.R. and R.M. Soland, "Multiple-Attribute Decision Making With Partial Information: The Expected- 

Value Criterion," Vol. 26, No. 2, June 1979, pp. 249-256. 
CHAUDHRY, M.L., "The Queueing System M^X/G/1 and its Ramifications," Vol. 26, No. 4, Dec. 1979, pp. 667-674. 
COHEN, L. and D.E. Reedy, "The Sensitivity of First Term Navy Reenlistment to Changes in Unemployment and Rela- 
tive Wages," Vol. 26, No. 4, Dec. 1979, pp. 695-709. 
COOPER, L. and J. Kennington, "Nonextreme Point Solution Strategies For Linear Programs," Vol. 26, No. 3, Sept. 

1979, pp. 447-461. 
CRAVEN, B.D. and B. Mond, "A Note on Duality in Homogeneous Fractional Programming," Vol. 26, No. 1, Mar. 

1979, pp. 153-155. 
CURRAN, R.T., S.C. Jaquette and J.L. Politzer, "Damage Calculations for Unreliable Warheads," Vol. 26, No. 3, Sept. 

1979, pp. 545-550. 
DEGROOT, M.H., "Bayesian Estimation and Optimal Designs in Partially Accelerated Life Testing," Vol. 26, No. 2, 

June 1979, pp. 223-235. 
DERMAN, C., G.J. Lieberman and S.M. Ross, "Adaptive Disposal Models," Vol. 26, No. 1, Mar. 1979, pp. 33-40. 
ELMAGHRABY, S.E. and P.S. Pulat, "Optimal Project Compression With Due-Dated Events," Vol. 26, No. 2, June 

1979, pp. 331-348. 
FISK, J.C. and M.S. Hung, "A Heuristic Routine for Solving Large Loading Problems," Vol. 26, No. 4, Dec. 1979, pp. 

643-650. 
FISK, J. and P. McKeown, "The Pure Fixed Charge Transportation Problem," Vol. 26, No. 4, Dec. 1979, pp. 631-641. 
GITTINS, J.C. and D.M. Roberts, "The Search for an Intelligent Evader Concealed in One of an Arbitrary Number of 

Regions," Vol. 26, No. 4, Dec. 1979, pp. 651-666. 
GOLDEN, B.L. and F.B. Alt, "Interval Estimation of a Global Optimum for Large Combinatorial Problems," Vol. 26, 

No. 1, Mar. 1979, pp. 69-77. 
GRAVES, S.C. and J. Keilson, "A Methodology for Studying the Dynamics of Extended Logistic Systems," Vol. 26, 

No. 2, June 1979, pp. 169-197. 
GUPTA, R.K., V. Srinivasan and P.L. Yu, "Optimal State-Dependent Pricing Policies for a Class of Stochastic Multiunit 

Service Systems," Vol. 26, No. 2, June 1979, pp. 257-283. 
HELGASON, R.V. and J.L. Kennington, "A New Storage Reduction Technique for the Solution of the Group Prob- 
lem," Vol. 26, No. 4, Dec. 1979, pp. 681-687. 
HODGSON, T.J. and G.J. Koehler, "Computation Techniques for Large Scale Undiscounted Markov Decision 

Processes," Vol. 26, No. 4, Dec. 1979, pp. 587-594. 
ISERMANN, H., "The Enumeration of all Efficient Solutions for a Linear Multiple-Objective Transportation Problem," 

Vol. 26, No. 1, Mar. 1979, pp. 123-139. 
JEFFERSON, T.R., G.M. Folie and C.H. Scott, "Duality for Quasi-Concave Programs With Application to Economics," 

Vol. 26, No. 4, Dec. 1979, pp. 611-625. 


JOSHI, P.C., "On the Moments of Gamma Order Statistics," Vol. 26, No. 4, Dec. 1979, pp. 675-679.
KARMARKAR, U.S., "Convex/Stochastic Programming and Multilocation Inventory Problems," Vol. 26, No. 1, Mar. 1979, pp. 1-19.
LEAVENWORTH, R.S. and R.L. Scheaffer, "Design of a Process Control Scheme for Defects Per 100 Units Based on AOQL," Vol. 26, No. 3, Sept. 1979, pp. 463-485.
LEWIS, P.A.W. and G.S. Shedler, "Simulation of Nonhomogeneous Poisson Processes by Thinning," Vol. 26, No. 3, Sept. 1979, pp. 403-413.
LUSS, H., "A Capacity-Expansion Model for Two Facility Types," Vol. 26, No. 2, June 1979, pp. 291-303.
MISRA, R.B., "A Note on Optimal Inventory Management Under Inflation," Vol. 26, No. 1, Mar. 1979, pp. 161-165.
MITCHELL, C.R. and A.S. Paulson, "M/M/1 Queues with Interdependent Arrival and Service Processes," Vol. 26, No. 1, Mar. 1979, pp. 47-56.
MUCKSTADT, J.A., "A Three-Echelon, Multi-Item Model for Recoverable Items," Vol. 26, No. 2, June 1979, pp. 199-221.
MURPHY, F.H. and A.L. Soyster, "Multiproduct Lot-Size Scheduling with Proportional Product Demands," Vol. 26, No. 1, Mar. 1979, pp. 97-108.
NEMHAUSER, G.L. and G.M. Weber, "Optimal Set Partitioning, Matchings and Lagrangian Duality," Vol. 26, No. 4, Dec. 1979, pp. 553-563.
PEGDEN, C.D. and C.C. Petersen, "An Algorithm (GIPC2) for Solving Integer Programming Problems With Separable Nonlinear Objective Functions," Vol. 26, No. 4, Dec. 1979, pp. 595-609.
PINEDO, M. and G. Weiss, "Scheduling of Stochastic Tasks on Two Parallel Processors," Vol. 26, No. 3, Sept. 1979, pp. 527-535.
RAMANI, K.V., "Some Bayes Tests and their Asymptotic Properties for the Multivariate, Multisample Goodness-of-Fit Problem," Vol. 26, No. 2, June 1979, pp. 237-247.
ROSENBERG, D., "A New Analysis of a Lot-Size Model With Partial Backlogging," Vol. 26, No. 2, June 1979, pp. 349-353.
ROSS, S.M. and J. Schechtman, "On the First Time a Separately Maintained Parallel System has been Down for a Fixed Time," Vol. 26, No. 2, June 1979, pp. 285-290.
SHANTHIKUMAR, J.G., "On a Single-Server Queue With State-Dependent Service," Vol. 26, No. 2, June 1979, pp. 305-309.
SHIMSHAK, D.G., "A Comparison of Waiting Time Approximations in Series Queueing Systems," Vol. 26, No. 3, Sept. 1979, pp. 499-509.
SHOGAN, A.W., "A Single Server Queue with Arrival Rate Dependent on Server Breakdowns," Vol. 26, No. 3, Sept. 1979, pp. 487-497.
SIEGMUND, D., "Confidence Intervals Related to Sequential Test for the Exponential Distribution," Vol. 26, No. 1, Mar. 1979, pp. 57-67.
SILVER, E.A., "Coordinated Replenishments of Items Under Time-Varying Demand: Dynamic Programming Formulation," Vol. 26, No. 1, Mar. 1979, pp. 141-151.
SUBELMAN, E.J., "Optimal Betting Strategies for Favorable Games," Vol. 26, No. 2, June 1979, pp. 355-363.
TAMIR, A., "Scheduling Jobs to Two Machines Subject to Batch Arrival Ordering," Vol. 26, No. 3, Sept. 1979, pp. 521-525.
TAYLOR, J.G., "Some Simple Victory-Prediction Conditions for Lanchester-Type Combat Between Two Homogeneous Forces With Supporting Fire," Vol. 26, No. 2, June 1979, pp. 365-375.
THIAGARAJAN, T.R. and C.M. Harris, "Statistical Tests for Exponential Services from M/G/1 Waiting-Time Data," Vol. 26, No. 3, Sept. 1979, pp. 511-520.
WAGNER, H.M., "The Next Decade of Logistics Research," Vol. 26, No. 3, Sept. 1979, pp. 377-392.
WEISS, L., "The Asymptotic Distribution of Order Statistics," Vol. 26, No. 3, Sept. 1979, pp. 437-445.
WHITE, C.C., III, "Bounds on Optimal Cost for a Replacement Problem with Partial Observations," Vol. 26, No. 3, Sept. 1979, pp. 415-422.
ZACKS, S., "Survival Distributions in Crossing Fields Containing Clusters of Mines with Possible Detection and Uncertain Activation or Kill," Vol. 26, No. 3, Sept. 1979, pp. 423-435.
ZUCKERMAN, D., "A Diffusion Model for the Control of a Multipurpose Reservoir System," Vol. 26, No. 4, Dec. 1979, pp. 579-586.



U.S. GOVERNMENT PRINTING OFFICE: 1979 O-305-622






INFORMATION FOR CONTRIBUTORS 

The NAVAL RESEARCH LOGISTICS QUARTERLY is devoted to the dissemination of 
scientific information in logistics and will publish research and expository papers, including those 
in certain areas of mathematics, statistics, and economics, relevant to the over-all effort to improve 
the efficiency and effectiveness of logistics operations. 

Manuscripts and other items for publication should be sent to The Managing Editor, NAVAL 
RESEARCH LOGISTICS QUARTERLY, Office of Naval Research, Arlington, Va. 22217. 
Each manuscript which is considered to be suitable material for the QUARTERLY is sent to one
or more referees. 

Manuscripts submitted for publication should be typewritten, double-spaced, and the author 
should retain a copy. Refereeing may be expedited if an extra copy of the manuscript is submitted 
with the original. 

A short abstract (not over 400 words) should accompany each manuscript. This will appear 
at the head of the published paper in the QUARTERLY. 

There is no authorization for compensation to authors for papers which have been accepted 
for publication. Authors will receive 250 reprints of their published papers. 

Readers are invited to submit to the Managing Editor items of general interest in the field
of logistics, for possible publication in the NEWS AND MEMORANDA or NOTES sections 
of the QUARTERLY. 



NAVAL RESEARCH 

LOGISTICS 

QUARTERLY 



DECEMBER 1979 
VOL. 26, NO. 4 

NAVSO P-1278 



CONTENTS

ARTICLES                                                                  Page

Optimal Set Partitioning, Matchings and Lagrangian Duality
    G. L. NEMHAUSER and G. M. WEBER .....................................  553

A Complete Importance Ranking for Components of Binary Coherent
Systems, With Extensions to Multi-State Systems
    D. A. BUTLER ........................................................  565

A Diffusion Model for the Control of a Multipurpose Reservoir System
    D. ZUCKERMAN ........................................................  579

Computation Techniques for Large Scale Undiscounted Markov
Decision Processes
    T. J. HODGSON and G. J. KOEHLER .....................................  587

An Algorithm (GIPC2) for Solving Integer Programming Problems
With Separable Nonlinear Objective Functions
    C. D. PEGDEN and C. C. PETERSEN .....................................  595

Duality for Quasi-Concave Programs With Application to Economics
    T. R. JEFFERSON, G. M. FOLIE and C. H. SCOTT ........................  611

On the Existence of Joint Production Functions
    R. AL-AYAT and R. FARE ..............................................  627

The Pure Fixed Charge Transportation Problem
    J. FISK and P. MCKEOWN ..............................................  631

A Heuristic Routine for Solving Large Loading Problems
    J. C. FISK and M. S. HUNG ...........................................  643

The Search for an Intelligent Evader Concealed in One of an
Arbitrary Number of Regions
    J. C. GITTINS and D. M. ROBERTS .....................................  651

The Queueing System M^X/G/1 and its Ramifications
    M. L. CHAUDHRY ......................................................  667

On the Moments of Gamma Order Statistics
    P. C. JOSHI .........................................................  675

A New Storage Reduction Technique for the Solution of the Group Problem
    R. V. HELGASON and J. L. KENNINGTON .................................  681

Experiments With Linear Fractional Problems
    G. R. BITRAN ........................................................  689

The Sensitivity of First Term Navy Reenlistment to Changes in
Unemployment and Relative Wages
    L. COHEN and D. E. REEDY ............................................  695

Index ...................................................................  711


OFFICE OF NAVAL RESEARCH 

Arlington, Va. 22217